Di Hu

Robust Cross-Modal Knowledge Distillation for Unconstrained Videos

Apr 16, 2023
Wenke Xia, Xingjian Li, Andong Deng, Haoyi Xiong, Dejing Dou, Di Hu

MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning

Mar 11, 2023
Ruize Xu, Ruoxuan Feng, Shi-Xiong Zhang, Di Hu

Revisiting Pre-training in Audio-Visual Learning

Feb 17, 2023
Ruoxuan Feng, Wenke Xia, Di Hu

Balanced Audiovisual Dataset for Imbalance Analysis

Feb 14, 2023
Wenke Xia, Xu Zhao, Xincheng Pang, Changqing Zhang, Di Hu

TikTalk: A Multi-Modal Dialogue Dataset for Real-World Chitchat

Jan 14, 2023
Hongpeng Lin, Ludan Ruan, Wenke Xia, Peiyu Liu, Jingyuan Wen, Yixin Xu, Di Hu, Ruihua Song, Wayne Xin Zhao, Qin Jin, Zhiwu Lu

Learning in Audio-visual Context: A Review, Analysis, and New Perspective

Aug 20, 2022
Yake Wei, Di Hu, Yapeng Tian, Xuelong Li

Dual Domain-Adversarial Learning for Audio-Visual Saliency Prediction

Aug 16, 2022
Yingzi Fan, Longfei Han, Yue Zhang, Lechao Cheng, Chen Xia, Di Hu

Learning to Answer Questions in Dynamic Audio-Visual Scenarios

Apr 05, 2022
Guangyao Li, Yake Wei, Yapeng Tian, Chenliang Xu, Ji-Rong Wen, Di Hu

Balanced Multimodal Learning via On-the-fly Gradient Modulation

Mar 29, 2022
Xiaokang Peng, Yake Wei, Andong Deng, Dong Wang, Di Hu
