/// AI HUB

Papers

Papers with Code paper May 27

SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control

The narrative quality of a video fundamentally determines its perceptual value. Although existing video generation methods can produce visually appealing content, they predominantl...

21
Papers with Code paper May 27

AsyncTool: Evaluating the Asynchronous Function Calling Capability under Multi-Task Scenarios

Large language model (LLM)-based agents have shown strong capabilities in using external tools to solve complex tasks. However, existing evaluations often overlook the temporal dim...

21
Papers with Code paper May 27

Pruning and Distilling Mixture-of-Experts into Dense Language Models

Mixture-of-Experts (MoE) is now the dominant architecture for frontier language models, yet it requires all expert parameters to be loaded in memory, making it less preferable for ...

21
Papers with Code paper May 27

Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models

Spatial intelligence requires visual representations that capture both semantic objects and geometric structure in the physical world. To support this, two major pre-training schem...

Multimodal
21
Papers with Code paper May 27

A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks

As agent capabilities advance, existing benchmarks, such as τ²-Bench, are becoming increasingly saturated. Yet constructing new benchmark tasks remains complex, costly, and labor-...

21
Papers with Code paper May 27

Frequency-Guided Action Diffusion via Sub-Frequency Manifold Traversal

Learning visuomotor policies via behavior cloning typically involves mimicking expert demonstrations collected by human operators. However, natural human demonstrations inherently ...

21
Papers with Code paper May 27

Comprehensive Benchmarking of Long-Form Speech Generation in Diverse Scenarios

Recent advances in speech generation have enabled high-fidelity synthesis, yet systematic evaluation of models under long-context conditions remains largely underexplored. A compre...

21
Papers with Code paper May 27

Models That Know How Evaluations Are Designed Score Safer

The validity of AI safety evaluations depends on models behaving consistently across controlled and deployment settings. Prior work has identified test-time contextual cues, such a...

21
Papers with Code paper May 27

FRAPPE: Full Input, Residual Output Autoencoding with Projection Pursuit Encoder

Media compression standards have reached a plateau in terms of the rate-distortion-complexity trade-off, limiting the ability to offload expensive AI perception to the cloud in app...

21
Papers with Code paper May 27

ESC-Skills: Discovering and Self-Evolving Skills for Emotional Support Conversations

Existing emotional support conversation (ESC) systems mainly rely on end-to-end response generation or coarse strategy supervision, offering limited interpretability and little sup...

21
Papers with Code paper May 27

DenoiseRL: Bootstrapping Reasoning Models to Recover from Noisy Prefixes

Reinforcement learning has become a central paradigm for advancing reasoning in large language models, yet most existing methods still depend on stronger teacher models or heavily ...

21
Papers with Code paper May 27

Review Arcade: On the Human Alignment and Gameability of LLM Reviews

LLM-generated reviews for scientific papers are gaining considerable traction and are even being officially piloted by major conferences. We have to assume that not only reviewers ...

LLM Safety/Alignment
21
Papers with Code paper May 27

Augmenting Attention with Exponentially Decaying Memory Improves Query-Aware KV Sparsity

Efficient inference is critical for long-context language models, where attention computation and KV-cache access dominate the cost. Recent work, RAT+, introduces a recurrence-augme...

21
Papers with Code paper May 27

Rethinking Memory as Continuously Evolving Connectivity

Existing memory-augmented LLM agents often treat memory as a static repository with pre-defined representations and fixed retrieval pipelines, which is brittle in dynamic agentic e...

21
Papers with Code paper May 27

MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems

Memory is essential for enabling large language models to support long-horizon reasoning, yet existing memory systems remain unreliable and difficult to debug. Tracing memory's dyn...

LLM
21
Papers with Code paper May 27

GEM: Generative Supervision Helps Embodied Intelligence

Embodied Vision-Language Models (VLMs) have demonstrated impressive performance and generalization in robotics, particularly within Vision-Language-Action frameworks. However, a si...

21
Papers with Code paper May 27

Skill0.5: Joint Skill Internalization and Utilization for Out-of-Distribution Generalization in Agentic Reinforcement Learning

Equipping large language models with explicit skills has emerged as a promising paradigm for enabling autonomous agents to solve complex tasks. Agent skills can be inherently divid...

Agents
21
Papers with Code paper May 27

Long Live The Balance: Information Bottleneck Driven Tree-based Policy Optimization

Recent advances in online reinforcement learning (RL) for large language models (LLMs) have demonstrated promising performance in complex reasoning tasks. However, they often exhib...

21
Papers with Code paper May 27

OmniVerifier-M1: Multimodal Meta-Verifier with Explicit Structured Recalibration

Visual outcomes are increasingly central to multimodal large language models, making reliable and fine-grained verification essential for scaling generalist foundation models. In t...

Multimodal
21
Papers with Code paper May 27

LiveBrowseComp: Are Search Agents Searching, or Just Verifying What They Already Know?

Are LLM-based search agents genuinely searching, or using the web to verify what they already know? We study this question on BrowseComp with three diagnostics. Our analysis reveal...

21
Papers with Code paper May 27

When Confidence Misleads: Suffix Anchoring and Anchor-Proximity Confidence Modulation for Diffusion Language Models

Diffusion language models decode text by iteratively denoising masked token sequences, making the choice of which positions to decode a central inference-time decision. Most traini...

21
Papers with Code paper May 27

AlphaTransit: Learning to Design City-scale Transit Routes

Designing a transit network requires many sequential route extension decisions, but their quality is often visible only after the full network is assembled. This delayed-feedback c...

21
Papers with Code paper May 27

The Fragility of Chain-of-Thought Monitoring Across Typologically Diverse Languages

Chain-of-thought (CoT) monitoring has been proposed as a promising safety mechanism for detecting misaligned behavior in large language models. However, its reliability remains lar...

21
Papers with Code paper May 27

GUI-CIDER: Mid-training GUI Agents via Causal Internalization and Density-aware Exemplar Reselection

Despite the rapid progress of multimodal large language models in building Graphical User Interface (GUI) agents, their real-world task completion is fundamentally bottlenecked by ...

21