Anwen Hu

mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model

Nov 30, 2023
Anwen Hu, Yaya Shi, Haiyang Xu, Jiabo Ye, Qinghao Ye, Ming Yan, Chenliang Li, Qi Qian, Ji Zhang, Fei Huang

Via arXiv

mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration

Nov 09, 2023
Qinghao Ye, Haiyang Xu, Jiabo Ye, Ming Yan, Anwen Hu, Haowei Liu, Qi Qian, Ji Zhang, Fei Huang, Jingren Zhou

Via arXiv

UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model

Oct 08, 2023
Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Guohai Xu, Chenliang Li, Junfeng Tian, Qi Qian, Ji Zhang, Qin Jin, Liang He, Xin Alex Lin, Fei Huang

Via arXiv

Explore and Tell: Embodied Visual Captioning in 3D Environments

Aug 21, 2023
Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin

Via arXiv

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Jul 04, 2023
Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Yuhao Dan, Chenlin Zhao, Guohai Xu, Chenliang Li, Junfeng Tian, Qian Qi, Ji Zhang, Fei Huang

Via arXiv

Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation

Jun 27, 2023
Zihao Yue, Anwen Hu, Liang Zhang, Qin Jin

Via arXiv

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks

Jun 07, 2023
Haiyang Xu, Qinghao Ye, Xuan Wu, Ming Yan, Yuan Miao, Jiabo Ye, Guohai Xu, Anwen Hu, Yaya Shi, Guangwei Xu, Chenliang Li, Qi Qian, Maofei Que, Ji Zhang, Xiao Zeng, Fei Huang

Via arXiv

Movie101: A New Movie Understanding Benchmark

May 20, 2023
Zihao Yue, Qi Zhang, Anwen Hu, Liang Zhang, Ziheng Wang, Qin Jin

Via arXiv

InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation

May 10, 2023
Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin

Via arXiv