Dinesh Manocha

CalibFree: Self-Supervised View Feature Separation for Calibration-Free Multi-Camera Multi-Object Tracking

May 10, 2026
Via arXiv

DRAGON: A Benchmark for Evidence-Grounded Visual Reasoning over Diagrams

Apr 28, 2026
Via arXiv

Learning Illumination Control in Diffusion Models

Apr 27, 2026
Via arXiv

Exploring Audio Hallucination in Egocentric Video Understanding

Apr 26, 2026
Via arXiv

Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks

Apr 22, 2026
Via arXiv

Video-Robin: Autoregressive Diffusion Planning for Intent-Grounded Video-to-Music Generation

Apr 19, 2026
Via arXiv

Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music

Apr 13, 2026
Via arXiv

What Drives Representation Steering? A Mechanistic Case Study on Steering Refusal

Apr 09, 2026
Via arXiv

Do Audio-Visual Large Language Models Really See and Hear?

Apr 03, 2026
Via arXiv

SABER: A Stealthy Agentic Black-Box Attack Framework for Vision-Language-Action Models

Mar 26, 2026
Via arXiv