Performer


FMPose3D: monocular 3D pose estimation via flow matching

Feb 05, 2026
Via arXiv

SwimBird: Eliciting Switchable Reasoning Mode in Hybrid Autoregressive MLLMs

Feb 05, 2026
Via arXiv

VRIQ: Benchmarking and Analyzing Visual-Reasoning IQ of VLMs

Feb 05, 2026
Via arXiv

A Comparative Study of 3D Person Detection: Sensor Modalities and Robustness in Diverse Indoor and Outdoor Environments

Feb 05, 2026
Via arXiv

Can vision language models learn intuitive physics from interaction?

Feb 05, 2026
Via arXiv

Variable Search Stepsize for Randomized Local Search in Multi-Objective Combinatorial Optimization

Feb 05, 2026
Via arXiv

Breaking Symmetry Bottlenecks in GNN Readouts

Feb 05, 2026
Via arXiv

Bifrost: Steering Strategic Trajectories to Bridge Contextual Gaps for Self-Improving Agents

Feb 05, 2026
Via arXiv

V-Retrver: Evidence-Driven Agentic Reasoning for Universal Multimodal Retrieval

Feb 05, 2026
Via arXiv

Visuo-Tactile World Models

Feb 05, 2026
Via arXiv