Yuexian Zou

Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning

Jan 30, 2024
Bang Yang, Yong Dai, Xuxin Cheng, Yaowei Li, Asif Raza, Yuexian Zou

ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding

Nov 19, 2023
Xuxin Cheng, Bowen Cao, Qichen Ye, Zhihong Zhu, Hongxiang Li, Yuexian Zou

UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework

Nov 16, 2023
Chris Kelly, Luhui Hu, Cindy Yang, Yu Tian, Deshun Yang, Bang Yang, Zaoshan Huang, Zihao Li, Yuexian Zou

Video Referring Expression Comprehension via Transformer with Content-conditioned Query

Oct 25, 2023
Ji Jiang, Meng Cao, Tengtao Song, Long Chen, Yi Wang, Yuexian Zou

NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement

Sep 03, 2023
Wen Wang, Dongchao Yang, Qichen Ye, Bowen Cao, Yuexian Zou

MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning

Aug 25, 2023
Bang Yang, Fenglin Liu, Xian Wu, Yaowei Wang, Xu Sun, Yuexian Zou

G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory

Aug 18, 2023
Hongxiang Li, Meng Cao, Xuxin Cheng, Yaowei Li, Zhihong Zhu, Yuexian Zou

Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions

Jul 28, 2023
Yifei Xin, Yuexian Zou

Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels

Jul 05, 2023
Bang Yang, Fenglin Liu, Zheng Li, Qingyu Yin, Chenyu You, Bing Yin, Yuexian Zou
