Labman42/Megatron-DiffuLM: Diffusion Language Model training (MDLM, BD3LM, EditFlow, A2D) on Megatron-LM with full TP/PP/DP/CP parallelism support
