Xudong Lin

SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos

Mar 03, 2024
Yulei Niu, Wenliang Guo, Long Chen, Xudong Lin, Shih-Fu Chang

Exploring the Reasoning Abilities of Multimodal Large Language Models (MLLMs): A Comprehensive Survey on Emerging Trends in Multimodal Reasoning

Jan 18, 2024
Yiqi Wang, Wentao Chen, Xiaotian Han, Xudong Lin, Haiteng Zhao, Yongfei Liu, Bohan Zhai, Jianbo Yuan, Quanzeng You, Hongxia Yang

InfiMM-Eval: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models

Dec 04, 2023
Xiaotian Han, Quanzeng You, Yongfei Liu, Wentao Chen, Huangjie Zheng, Khalil Mrini, Xudong Lin, Yiqi Wang, Bohan Zhai, Jianbo Yuan, Heng Wang, Hongxia Yang

Video Summarization: Towards Entity-Aware Captions

Dec 01, 2023
Hammad A. Ayyubi, Tianqi Liu, Arsha Nagrani, Xudong Lin, Mingda Zhang, Anurag Arnab, Feng Han, Yukun Zhu, Jialu Liu, Shih-Fu Chang

CORE-MM: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models

Nov 27, 2023
Xiaotian Han, Quanzeng You, Yongfei Liu, Wentao Chen, Huangjie Zheng, Khalil Mrini, Xudong Lin, Yiqi Wang, Bohan Zhai, Jianbo Yuan, Heng Wang, Hongxia Yang

Non-Sequential Graph Script Induction via Multimedia Grounding

May 27, 2023
Yu Zhou, Sha Li, Manling Li, Xudong Lin, Shih-Fu Chang, Mohit Bansal, Heng Ji

Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering

Apr 07, 2023
Hung-Ting Su, Yulei Niu, Xudong Lin, Winston H. Hsu, Shih-Fu Chang

Supervised Masked Knowledge Distillation for Few-Shot Transformers

Mar 29, 2023
Han Lin, Guangxing Han, Jiawei Ma, Shiyuan Huang, Xudong Lin, Shih-Fu Chang
