pablo-reyes8/deepseekv4-mini-pytorch: From-scratch, paper-faithful PyTorch implementation of the DeepSeek-V4 core architecture for transparent study, testing, ablation, and mini-scale training.

