Subhashree Radhakrishnan

4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation

Dec 22, 2025
Via arXiv

NVIDIA Nemotron Nano V2 VL

Nov 07, 2025
Via arXiv

3D Aware Region Prompted Vision Language Model

Sep 16, 2025
Via arXiv

FRAG: Frame Selection Augmented Generation for Long Video and Long Document Understanding

Apr 24, 2025
Via arXiv

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Jan 14, 2025
Via arXiv

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Aug 28, 2024
Via arXiv

What is Point Supervision Worth in Video Instance Segmentation?

Apr 01, 2024
Via arXiv

LITA: Language Instructed Temporal-Localization Assistant

Mar 27, 2024
Via arXiv

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

Jun 05, 2021
Via arXiv