Sayan Nag

Object-WIPER: Training-Free Object and Associated Effect Removal in Videos

Jan 10, 2026
Via arXiv

SciFig: Towards Automating Scientific Figure Generation

Jan 07, 2026
Via arXiv

SliderEdit: Continuous Image Editing with Fine-Grained Instruction Control

Nov 12, 2025
Via arXiv

AURA: A Fine-Grained Benchmark and Decomposed Metric for Audio-Visual Reasoning

Aug 10, 2025
Via arXiv

MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks

Jun 08, 2025
Via arXiv

Localizing Knowledge in Diffusion Transformers

May 24, 2025
Via arXiv

Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs

Mar 29, 2025
Via arXiv

AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs

Jan 03, 2025
Via arXiv

SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation

Jul 02, 2024
Via arXiv

Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time

Jul 01, 2024
Via arXiv