Chaoyou Fu

A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise

Dec 20, 2023
Chaoyou Fu, Renrui Zhang, Zihan Wang, Yubo Huang, Zhengye Zhang, Longtian Qiu, Gaoxiang Ye, Yunhang Shen, Mengdan Zhang, Peixian Chen, Sirui Zhao, Shaohui Lin, Deqiang Jiang, Di Yin, Peng Gao, Ke Li, Hongsheng Li, Xing Sun

Aligning and Prompting Everything All at Once for Universal Visual Perception

Dec 04, 2023
Yunhang Shen, Chaoyou Fu, Peixian Chen, Mengdan Zhang, Ke Li, Xing Sun, Yunsheng Wu, Shaohui Lin, Rongrong Ji

ChatIllusion: Efficient-Aligning Interleaved Generation ability with Visual Instruction Model

Nov 29, 2023
Xiaowei Chi, Yijiang Liu, Zhengkai Jiang, Rongyu Zhang, Ziyi Lin, Renrui Zhang, Peng Gao, Chaoyou Fu, Shanghang Zhang, Qifeng Liu, Yike Guo

Woodpecker: Hallucination Correction for Multimodal Large Language Models

Oct 24, 2023
Shukang Yin, Chaoyou Fu, Sirui Zhao, Tong Xu, Hao Wang, Dianbo Sui, Yunhang Shen, Ke Li, Xing Sun, Enhong Chen

CAPro: Webly Supervised Learning with Cross-Modality Aligned Prototypes

Oct 15, 2023
Yulei Qin, Xingyu Chen, Yunhang Shen, Chaoyou Fu, Yun Gu, Ke Li, Xing Sun, Rongrong Ji

Audio-Driven Dubbing for User Generated Contents via Style-Aware Semi-Parametric Synthesis

Aug 31, 2023
Linsen Song, Wayne Wu, Chaoyou Fu, Chen Change Loy, Ran He

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

Jul 02, 2023
Chaoyou Fu, Peixian Chen, Yunhang Shen, Yulei Qin, Mengdan Zhang, Xu Lin, Zhenyu Qiu, Wei Lin, Jinrui Yang, Xiawu Zheng, Ke Li, Xing Sun, Rongrong Ji

A Survey on Multimodal Large Language Models

Jun 23, 2023
Shukang Yin, Chaoyou Fu, Sirui Zhao, Ke Li, Xing Sun, Tong Xu, Enhong Chen

Multi-modal Queried Object Detection in the Wild

May 30, 2023
Yifan Xu, Mengdan Zhang, Chaoyou Fu, Peixian Chen, Xiaoshan Yang, Ke Li, Changsheng Xu
