Mohit Bansal

Shammie

MERRIN: A Benchmark for Multimodal Evidence Retrieval and Reasoning in Noisy Web Environments

Apr 15, 2026
Via arXiv

Playing Along: Learning a Double-Agent Defender for Belief Steering via Theory of Mind

Apr 13, 2026
Via arXiv

The Master Key Hypothesis: Unlocking Cross-Model Capability Transfer via Linear Subspace Alignment

Apr 07, 2026
Via arXiv

Cog-DRIFT: Exploration on Adaptively Reformulated Instances Enables Learning from Hard Reasoning Problems

Apr 06, 2026
Via arXiv

Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos

Mar 23, 2026
Via arXiv

V-Co: A Closer Look at Visual Representation Alignment via Co-Denoising

Mar 17, 2026
Via arXiv

VisionCoach: Reinforcing Grounded Video Reasoning via Visual-Perception Prompting

Mar 15, 2026
Via arXiv

Does AI See like Art Historians? Interpreting How Vision Language Models Recognize Artistic Style

Mar 11, 2026
Via arXiv

Balancing Faithfulness and Performance in Reasoning via Multi-Listener Soft Execution

Feb 18, 2026
Via arXiv

AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories

Feb 16, 2026
Via arXiv