Xuming Hu

Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey

Dec 03, 2024

Learning Robust Anymodal Segmentor with Unimodal and Cross-modal Distillation

Nov 26, 2024

SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context

Nov 25, 2024

ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models

Nov 22, 2024

Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios

Nov 05, 2024

Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models

Oct 29, 2024

NeuGPT: Unified multi-modal Neural GPT

Oct 28, 2024

Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality

Oct 07, 2024

MINER: Mining the Underlying Pattern of Modality-Specific Neurons in Multimodal Large Language Models

Oct 07, 2024

ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection

Oct 06, 2024