Mohit Bansal

Shammie

Cog-DRIFT: Exploration on Adaptively Reformulated Instances Enables Learning from Hard Reasoning Problems

Apr 06, 2026
Via arXiv

Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos

Mar 23, 2026
Via arXiv

V-Co: A Closer Look at Visual Representation Alignment via Co-Denoising

Mar 17, 2026
Via arXiv

VisionCoach: Reinforcing Grounded Video Reasoning via Visual-Perception Prompting

Mar 15, 2026
Via arXiv

Does AI See like Art Historians? Interpreting How Vision Language Models Recognize Artistic Style

Mar 11, 2026
Via arXiv

Balancing Faithfulness and Performance in Reasoning via Multi-Listener Soft Execution

Feb 18, 2026
Via arXiv

AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories

Feb 16, 2026
Via arXiv

Multimodal Fact-Level Attribution for Verifiable Reasoning

Feb 12, 2026
Via arXiv

Effective Reasoning Chains Reduce Intrinsic Dimensionality

Feb 09, 2026
Via arXiv

When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning

Feb 09, 2026
Via arXiv