Aoxiong Yin

T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text

Jun 11, 2024
Via arXiv

TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation

Dec 23, 2023
Via arXiv

Language Model is a Branch Predictor for Simultaneous Machine Translation

Dec 22, 2023
Via arXiv

TrainerAgent: Customizable and Efficient Model Training through LLM-Powered Multi-Agent System

Nov 23, 2023
Via arXiv

3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding

Jul 25, 2023
Via arXiv

Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding

Jul 18, 2023
Via arXiv

Gloss Attention for Gloss-free Sign Language Translation

Jul 14, 2023
Via arXiv

Connecting Multi-modal Contrastive Representations

May 22, 2023
Via arXiv

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

Mar 09, 2023
Via arXiv

SimulSLT: End-to-End Simultaneous Sign Language Translation

Dec 08, 2021
Via arXiv