Zehan Wang

TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation

Dec 23, 2023
Xize Cheng, Rongjie Huang, Linjun Li, Tao Jin, Zehan Wang, Aoxiong Yin, Minglei Li, Xinyu Duan, Changpeng Yang, Zhou Zhao

Multi-Modal Domain Adaptation Across Video Scenes for Temporal Video Grounding

Dec 21, 2023
Haifeng Huang, Yang Zhao, Zehan Wang, Yan Xia, Zhou Zhao

Chat-3D v2: Bridging 3D Scene and Large Language Models with Object Identifiers

Dec 15, 2023
Haifeng Huang, Zehan Wang, Rongjie Huang, Luping Liu, Xize Cheng, Yang Zhao, Tao Jin, Zhou Zhao

Extending Multi-modal Contrastive Representations

Oct 13, 2023
Zehan Wang, Ziang Zhang, Luping Liu, Yang Zhao, Haifeng Huang, Tao Jin, Zhou Zhao

Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes

Aug 17, 2023
Zehan Wang, Haifeng Huang, Yang Zhao, Ziang Zhang, Zhou Zhao

3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding

Jul 25, 2023
Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao

Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding

Jul 18, 2023
Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao

Connecting Multi-modal Contrastive Representations

May 22, 2023
Zehan Wang, Yang Zhao, Xize Cheng, Haifeng Huang, Jiageng Liu, Li Tang, Linjun Li, Yongqi Wang, Aoxiong Yin, Ziang Zhang, Zhou Zhao

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

Mar 09, 2023
Xize Cheng, Linjun Li, Tao Jin, Rongjie Huang, Wang Lin, Zehan Wang, Huangdai Liu, Ye Wang, Aoxiong Yin, Zhou Zhao

Frame Interpolation with Multi-Scale Deep Loss Functions and Generative Adversarial Networks

Nov 16, 2017
Joost van Amersfoort, Wenzhe Shi, Alejandro Acosta, Francisco Massa, Johannes Totz, Zehan Wang, Jose Caballero
