Papers with Code paper Jun 2

Can Generalist Agents Automate Data Curation?

Curating training data is among the most consequential yet labor-intensive parts of modern AI development: practitioners iteratively propose, implement, evaluate, and revise data p...

Papers with Code paper Jun 2

Large Language Models Hack Rewards, and Society

Reinforcement learning (RL) has become a dominant post-training paradigm, enabling large language models (LLMs) to learn from rewards. We observe that societal regulations are stru...

Papers with Code paper Jun 2

Qwen-Image-Flash: Beyond Objective Design

Few-step distillation has become an effective strategy for accelerating advanced visual generative models, yet prior work has largely focused on distillation objectives. In this wo...

Papers with Code paper Jun 2

MemTrain: Self-Supervised Context Memory Training

Memory is an indispensable capability for long-horizon LLM agents, enabling them to preserve and utilize information accumulated across extended interactions. Existing memory-agent...