Papers

Papers with Code paper Jul 13

Are LLMs Ready for Scientific Discovery? A Capability-Oriented Benchmark for AI Scientists

Existing benchmarks for scientific data analysis evaluate LLMs primarily on code execution or workflow completion, overlooking that scientific analysis serves to support distinct t...

Benchmark

Papers with Code paper Jul 13

Know Before Fix: QA-Driven Repository Knowledge Acquisition for Software Issue Resolution

LLM-based coding agents have significantly advanced automated software issue resolution, yet they remain highly prone to factual errors caused by insufficient repository understand...

Papers with Code paper Jul 13

SVR-R1: Bootstrapping Multi-modal Reasoning with Self-verification in Reinforcement Learning

We introduce Self-Verified Reasoner (SVR-R1), a multi-turn RL framework that turns a model's own verification into a learning signal for multimodal reasoning. For each query, the m...

Papers with Code paper Jul 13

Motion4Motion: Motion Transfer Across Subjects at Inference

This work explores the motion transfer from one video to another, which is crucial in animation for diverse characters. Previously, video motion transfer has been largely explored ...

Papers with Code paper Jul 13

Proxy Exploration and Reusable Guidance: A Modular LLM Post-Training Paradigm via Proxy-Guided Update Signals

Post-training is essential for refining the domain-specific capabilities of large language models (LLMs), yet existing reward optimization and distribution matching methods tightly...

LLM

Papers with Code paper Jul 13

AdvancedMathBench: A Benchmark Suite for Advanced Mathematical Proof Generation and Verification

Large language models (LLMs) have achieved remarkable performance on high-school and olympiad-style mathematics, yet their capabilities on advanced mathematics remain poorly unders...

Benchmark

Papers with Code paper Jul 13

LightMem-Ego: Your AI Memory for Everyday Life

Personal AI assistants on mobile and wearable devices continuously perceive users' daily lives through visual and audio streams. However, answering queries about past experiences r...

Papers with Code paper Jul 13

MET: Theory-Grounded and Culture-Aware Multilingual Moral Reasoning

Language models are increasingly used for moral decision-making across diverse linguistic and cultural contexts, yet existing work overlooks multilinguality on three aspects: 1) mu...

Papers with Code paper Jul 13

Read It Back: Pretrained MLLMs Are Zero-Shot Reward Models for Text-to-Image Generation

In this paper, we propose SpectraReward, a training-free reward function that turns pretrained MLLMs into off-the-shelf reward models for image-generation reinforcement learning. I...

Image Generation

Papers with Code paper Jul 13

RAGU: A Multi-Step GraphRAG Engine with a Compact Domain-Adapted LLM

Graph retrieval-augmented generation (GraphRAG) enhances large language models with structured knowledge, yet existing systems construct knowledge graphs in a single extraction pas...

LLM

Papers with Code paper Jul 13

Qwen-Music Technical Report

In this report, we introduce Qwen-Music, a powerful music generation model capable of producing highly musical and high-fidelity songs with complete vocal singing. Qwen-Music suppo...

Papers with Code paper Jul 12

LATO.2: Factorized 3D Mesh Generation with Vertex and Topology Flow

Flow matching over carefully designed latent representations has recently emerged as a powerful paradigm for topology-aware mesh generation. Existing approaches, however, model ver...

Papers with Code paper Jul 12

Predictive Divergence Masks for LLM RL

Reinforcement learning for large language models (LLMs) typically relies on trust-region masks to stabilize off-policy updates. The dominant PPO-style approach uses the sampled-tok...

LLM

Papers with Code paper Jul 12

Towards Autonomous and Auditable Medical Imaging Model Development

Large language model (LLM) agents are beginning to automate machine learning engineering (MLE) by coupling planning, code execution, debugging, and empirical feedback. Translating ...

Papers with Code paper Jul 11

GigaChat Audio: Time-aware Large Audio Language Model

Temporal grounding in long recordings remains challenging for audio-conditioned LLMs. We present a time-aware audio LLM that answers questions with explicit timestamps over up to 1...

LLM

Papers with Code paper Jul 11

GRASP: GRanularity-Aware Search Policy for Agentic RAG

Agentic retrieval-augmented generation (RAG) extends static RAG by allowing language models to iteratively reason, generate search queries, retrieve evidence, and predict answers. ...

RAG Agents

Papers with Code paper Jul 11

SynthDocBench: Controlled Benchmark for Long-Context Visual Document Understanding

Vision language models (VLMs) have achieved strong performance on visual document understanding benchmarks such as DocVQA, ChartQA, and MMLongBench-Doc. However, real-world documen...

Benchmark

Papers with Code paper Jul 11

ABot-AgentOS: A General Robotic Agent OS with Lifelong Multi-modal Memory

Recent VLM and VLA systems have improved robotic perception and action prediction, yet long-horizon embodied agents still require a general runtime layer for reasoning, memory, too...

Papers with Code paper Jul 11

ABot-N1: Toward a General Visual Language Navigation Foundation Model

Visual Language Navigation foundation models aim to unify deep reasoning for grounded spatial decisions with broad versatility for diverse embodied tasks. Current approaches typica...

LLM

Papers with Code paper Jul 11

Beyond Euclidean Clipping: Overcoming Exploration Collapse in LLM RL via Riemannian Isometric Policy Optimization

Reinforcement learning (RL) has become a dominant paradigm for enhancing LLMs' reasoning capabilities. However, RL algorithms with PPO-Clip are inherently limited by exploration co...

LLM

Papers with Code paper Jul 11

GigaAM Multilingual: Foundation Model for Underrepresented Languages

Despite recent scaling successes, multilingual ASR performance remains highly uneven, with long-tail languages suffering from severe data scarcity. This work addresses the challeng...

LLM

Papers with Code paper Jul 10

CtrlVTON: Controllable Virtual Try-On via Visual-Instance-Prompt Segmentation

4D Human-Scene Reconstruction from Low-Overlap Captures

REBASE: Reference-Background Subspace Elimination for Training-Free In-Context Segmentation