Bolin Lai

MM-SpuBench: Towards Better Understanding of Spurious Biases in Multimodal LLMs

Jun 24, 2024
Via arXiv

What is the Visual Cognition Gap between Humans and Multimodal LLMs?

Jun 14, 2024
Via arXiv

Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations

Mar 04, 2024
Via arXiv

Learning-based Bone Quality Classification Method for Spinal Metastasis

Feb 14, 2024
Via arXiv

Weakly Supervised Segmentation of Vertebral Bodies with Iterative Slice-propagation

Feb 14, 2024
Via arXiv

LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning

Dec 06, 2023
Via arXiv

Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation

May 06, 2023
Via arXiv

Werewolf Among Us: A Multimodal Dataset for Modeling Persuasion Behaviors in Social Deduction Games

Dec 16, 2022
Via arXiv

In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation

Aug 10, 2022
Via arXiv

A deep learning pipeline for localization, differentiation, and uncertainty estimation of liver lesions using multi-phasic and multi-sequence MRI

Oct 17, 2021
Via arXiv