Benchmarking

ActionParty: Multi-Subject Action Binding in Generative Video Games

Apr 02, 2026
Via arXiv

Modulate-and-Map: Crossmodal Feature Mapping with Cross-View Modulation for 3D Anomaly Detection

Apr 02, 2026
Via arXiv

Steerable Visual Representations

Apr 02, 2026
Via arXiv

Grounded Token Initialization for New Vocabulary in LMs for Generative Recommendation

Apr 02, 2026
Via arXiv

Beyond Referring Expressions: Scenario Comprehension Visual Grounding

Apr 02, 2026
Via arXiv

Batched Contextual Reinforcement: A Task-Scaling Law for Efficient Reasoning

Apr 02, 2026
Via arXiv

A Simple Baseline for Streaming Video Understanding

Apr 02, 2026
Via arXiv

Beyond the Assistant Turn: User Turn Generation as a Probe of Interaction Awareness in Language Models

Apr 02, 2026
Via arXiv

Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing

Apr 02, 2026
Via arXiv

Novel Memory Forgetting Techniques for Autonomous AI Agents: Balancing Relevance and Efficiency

Apr 02, 2026
Via arXiv