Papers with Code paper 4d ago

Kwai Keye-VL-2.0 Technical Report

We introduce Kwai Keye-VL-2.0-30B-A3B, an open-source Mixture-of-Experts (MoE) multimodal foundation model designed to advance long-video understanding and agentic intelligence. To...

Papers with Code paper 5d ago

Rethinking the Divergence Regularization in LLM RL

Reinforcement learning (RL) has become a key component of post-training large language models (LLMs). In practice, LLM RL is often off-policy because of training-inference mismatch...