Shih-Fu Chang

From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models

Mar 18, 2024
Kung-Hsiang Huang, Hou Pong Chan, Yi R. Fung, Haoyi Qiu, Mingyang Zhou, Shafiq Joty, Shih-Fu Chang, Heng Ji

SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos

Mar 03, 2024
Yulei Niu, Wenliang Guo, Long Chen, Xudong Lin, Shih-Fu Chang

Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning

Dec 15, 2023
Kung-Hsiang Huang, Mingyang Zhou, Hou Pong Chan, Yi R. Fung, Zhenhailong Wang, Lingyu Zhang, Shih-Fu Chang, Heng Ji

Video Summarization: Towards Entity-Aware Captions

Dec 01, 2023
Hammad A. Ayyubi, Tianqi Liu, Arsha Nagrani, Xudong Lin, Mingda Zhang, Anurag Arnab, Feng Han, Yukun Zhu, Jialu Liu, Shih-Fu Chang

Characterizing Video Question Answering with Sparsified Inputs

Nov 27, 2023
Shiyuan Huang, Robinson Piramuthu, Vicente Ordonez, Shih-Fu Chang, Gunnar A. Sigurdsson

Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond

Oct 31, 2023
Zhecan Wang, Long Chen, Haoxuan You, Keyang Xu, Yicheng He, Wenhao Li, Noel Codella, Kai-Wei Chang, Shih-Fu Chang

Ferret: Refer and Ground Anything Anywhere at Any Granularity

Oct 11, 2023
Haoxuan You, Haotian Zhang, Zhe Gan, Xianzhi Du, Bowen Zhang, Zirui Wang, Liangliang Cao, Shih-Fu Chang, Yinfei Yang

UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding

Jul 03, 2023
Rui Sun, Zhecan Wang, Haoxuan You, Noel Codella, Kai-Wei Chang, Shih-Fu Chang

Learning from Children: Improving Image-Caption Pretraining via Curriculum

May 30, 2023
Hammad A. Ayyubi, Rahul Lokesh, Alireza Zareian, Bo Wu, Shih-Fu Chang
