flashserve/Mosaic: MOSAIC: Unlocking Over 30× Context Length for Diffusion LLMs Inference via Global Memory Planning and Dynamic Peak Taming
