Kunchang Li

From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

Jan 29, 2024
Chaochao Lu, Chen Qian, Guodong Zheng, Hongxing Fan, Hongzhi Gao, Jie Zhang, Jing Shao, Jingyi Deng, Jinlan Fu, Kexin Huang, Kunchang Li, Lijun Li, Limin Wang, Lu Sheng, Meiqi Chen, Ming Zhang, Qibing Ren, Sirui Chen, Tao Gui, Wanli Ouyang, Yali Wang, Yan Teng, Yaru Wang, Yi Wang, Yinan He, Yingchun Wang, Yixu Wang, Yongting Zhang, Yu Qiao, Yujiong Shen, Yurong Mou, Yuxi Chen, Zaibin Zhang, Zhelun Shi, Zhenfei Yin, Zhipin Wang

Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks

Jan 25, 2024
Tianhe Ren, Shilong Liu, Ailing Zeng, Jing Lin, Kunchang Li, He Cao, Jiayu Chen, Xinyu Huang, Yukang Chen, Feng Yan, Zhaoyang Zeng, Hao Zhang, Feng Li, Jie Yang, Hongyang Li, Qing Jiang, Lei Zhang

Vlogger: Make Your Dream A Vlog

Jan 17, 2024
Shaobin Zhuang, Kunchang Li, Xinyuan Chen, Yaohui Wang, Ziwei Liu, Yu Qiao, Yali Wang

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark

Dec 03, 2023
Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, Yi Liu, Zun Wang, Jilan Xu, Guo Chen, Ping Luo, Limin Wang, Yu Qiao

Harvest Video Foundation Models via Efficient Post-Pretraining

Oct 30, 2023
Yizhuo Li, Kunchang Li, Yinan He, Yi Wang, Yali Wang, Limin Wang, Yu Qiao, Ping Luo

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

Jul 13, 2023
Yi Wang, Yinan He, Yizhuo Li, Kunchang Li, Jiashuo Yu, Xin Ma, Xinyuan Chen, Yaohui Wang, Ping Luo, Ziwei Liu, Yali Wang, Limin Wang, Yu Qiao

InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language

May 11, 2023
Zhaoyang Liu, Yinan He, Wenhai Wang, Weiyun Wang, Yi Wang, Shoufa Chen, Qinglong Zhang, Yang Yang, Qingyun Li, Jiashuo Yu, Kunchang Li, Zhe Chen, Xue Yang, Xizhou Zhu, Yali Wang, Limin Wang, Ping Luo, Jifeng Dai, Yu Qiao

Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Mar 28, 2023
Kunchang Li, Yali Wang, Yizhuo Li, Yi Wang, Yinan He, Limin Wang, Yu Qiao

InternVideo: General Video Foundation Models via Generative and Discriminative Learning

Dec 07, 2022
Yi Wang, Kunchang Li, Yizhuo Li, Yinan He, Bingkun Huang, Zhiyu Zhao, Hongjie Zhang, Jilan Xu, Yi Liu, Zun Wang, Sen Xing, Guo Chen, Junting Pan, Jiashuo Yu, Yali Wang, Limin Wang, Yu Qiao
