
Yufei Zha

DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction

Mar 02, 2024
Junwen Xiong, Peng Zhang, Tao You, Chuanyue Li, Wei Huang, Yufei Zha

Via arXiv

UniST: Towards Unifying Saliency Transformer for Video Saliency Prediction and Detection

Sep 15, 2023
Junwen Xiong, Peng Zhang, Chuanyue Li, Wei Huang, Yufei Zha, Tao You

Via arXiv

Induction Network: Audio-Visual Modality Gap-Bridging for Self-Supervised Sound Source Localization

Aug 09, 2023
Tianyu Liu, Peng Zhang, Wei Huang, Yufei Zha, Tao You, Yanning Zhang

Via arXiv

FTFDNet: Learning to Detect Talking Face Video Manipulation with Tri-Modality Interaction

Jul 08, 2023
Ganglai Wang, Peng Zhang, Junwen Xiong, Feihan Yang, Wei Huang, Yufei Zha

Via arXiv

CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective

Mar 11, 2023
Junwen Xiong, Ganglai Wang, Peng Zhang, Wei Huang, Yufei Zha, Guangtao Zhai

Via arXiv

An Audio-Visual Attention Based Multimodal Network for Fake Talking Face Videos Detection

Mar 10, 2022
Ganglai Wang, Peng Zhang, Lei Xie, Wei Huang, Yufei Zha, Yanning Zhang

Via arXiv

Attention-Based Lip Audio-Visual Synthesis for Talking Face Generation in the Wild

Mar 08, 2022
Ganglai Wang, Peng Zhang, Lei Xie, Wei Huang, Yufei Zha

Via arXiv

Audio-visual speech separation based on joint feature representation with cross-modal attention

Mar 05, 2022
Junwen Xiong, Peng Zhang, Lei Xie, Wei Huang, Yufei Zha, Yanning Zhang

Via arXiv

Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement

Mar 04, 2022
Junwen Xiong, Yu Zhou, Peng Zhang, Lei Xie, Wei Huang, Yufei Zha

Via arXiv

Unsupervised Cross-Modal Distillation for Thermal Infrared Tracking

Jul 31, 2021
Jingxian Sun, Lichao Zhang, Yufei Zha, Abel Gonzalez-Garcia, Peng Zhang, Wei Huang, Yanning Zhang

Via arXiv