Yuexian Zou

VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding

Mar 14, 2024
Chris Kelly, Luhui Hu, Jiayin Hu, Yu Tian, Deshun Yang, Bang Yang, Cindy Yang, Zihao Li, Zaoshan Huang, Yuexian Zou

VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework

Mar 14, 2024
Chris Kelly, Luhui Hu, Bang Yang, Yu Tian, Deshun Yang, Cindy Yang, Zaoshan Huang, Zihao Li, Jiayin Hu, Yuexian Zou

WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs

Mar 10, 2024
Deshun Yang, Luhui Hu, Yu Tian, Zihao Li, Chris Kelly, Bang Yang, Cindy Yang, Yuexian Zou

Learn Suspected Anomalies from Event Prompts for Video Anomaly Detection

Mar 02, 2024
Chenchen Tao, Chong Wang, Yuexian Zou, Xiaohao Peng, Jiafei Wu, Jiangbo Qian

Retrieval is Accurate Generation

Feb 29, 2024
Bowen Cao, Deng Cai, Leyang Cui, Xuxin Cheng, Wei Bi, Yuexian Zou, Shuming Shi

Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning

Jan 30, 2024
Bang Yang, Yong Dai, Xuxin Cheng, Yaowei Li, Asif Raza, Yuexian Zou

ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding

Nov 19, 2023
Xuxin Cheng, Bowen Cao, Qichen Ye, Zhihong Zhu, Hongxiang Li, Yuexian Zou

UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework

Nov 16, 2023
Chris Kelly, Luhui Hu, Cindy Yang, Yu Tian, Deshun Yang, Bang Yang, Zaoshan Huang, Zihao Li, Yuexian Zou

Video Referring Expression Comprehension via Transformer with Content-conditioned Query

Oct 25, 2023
Ji Jiang, Meng Cao, Tengtao Song, Long Chen, Yi Wang, Yuexian Zou

NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement

Sep 03, 2023
Wen Wang, Dongchao Yang, Qichen Ye, Bowen Cao, Yuexian Zou
