
Yuexian Zou


Aligning Source Visual and Target Language Domains for Unpaired Video Captioning

Nov 22, 2022
Fenglin Liu, Xian Wu, Chenyu You, Shen Ge, Yuexian Zou, Xu Sun

Via arXiv

A Dynamic Graph Interactive Framework with Label-Semantic Injection for Spoken Language Understanding

Nov 08, 2022
Zhihong Zhu, Weiyuan Xu, Xuxin Cheng, Tengtao Song, Yuexian Zou

Via arXiv

NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS

Nov 04, 2022
Dongchao Yang, Songxiang Liu, Jianwei Yu, Helin Wang, Chao Weng, Yuexian Zou

Via arXiv

DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention

Oct 28, 2022
Fenglin Liu, Xian Wu, Shen Ge, Xuancheng Ren, Wei Fan, Xu Sun, Yuexian Zou

Via arXiv

Video Referring Expression Comprehension via Transformer with Content-aware Query

Oct 06, 2022
Ji Jiang, Meng Cao, Tengtao Song, Yuexian Zou

Via arXiv

Correspondence Matters for Video Referring Expression Comprehension

Jul 21, 2022
Meng Cao, Ji Jiang, Long Chen, Yuexian Zou

Via arXiv

LocVTP: Video-Text Pre-training for Temporal Localization

Jul 21, 2022
Meng Cao, Tianyu Yang, Junwu Weng, Can Zhang, Jue Wang, Yuexian Zou

Via arXiv

Diffsound: Discrete Diffusion Model for Text-to-sound Generation

Jul 20, 2022
Dongchao Yang, Jianwei Yu, Helin Wang, Wen Wang, Chao Weng, Yuexian Zou, Dong Yu

Via arXiv

LAE: Language-Aware Encoder for Monolingual and Multilingual ASR

Jun 05, 2022
Jinchuan Tian, Jianwei Yu, Chunlei Zhang, Chao Weng, Yuexian Zou, Dong Yu

Via arXiv

Improving Dual-Microphone Speech Enhancement by Learning Cross-Channel Features with Multi-Head Attention

May 03, 2022
Xinmeng Xu, Rongzhi Gu, Yuexian Zou

Via arXiv