Arijit Ray

GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation

May 19, 2025

SAT: Spatial Aptitude Training for Multimodal Language Models

Dec 10, 2024

BloomVQA: Assessing Hierarchical Multi-modal Comprehension

Dec 20, 2023

Lasagna: Layered Score Distillation for Disentangled Object Relighting

Nov 30, 2023

Socratis: Are large multimodal models emotionally aware?

Sep 05, 2023

COLA: How to adapt vision-language models to Compose Objects Localized with Attributes?

May 05, 2023

Language-Guided Audio-Visual Source Separation via Trimodal Consistency

Mar 28, 2023

Improving Users' Mental Model with Attention-directed Counterfactual Edits

Oct 15, 2021

Knowing What VQA Does Not: Pointing to Error-Inducing Regions to Improve Explanation Helpfulness

Mar 31, 2021

The Impact of Explanations on AI Competency Prediction in VQA

Jul 02, 2020