Shih-Fu Chang

Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

Apr 11, 2024
Haotian Zhang, Haoxuan You, Philipp Dufter, Bowen Zhang, Chen Chen, Hong-You Chen, Tsu-Jui Fu, William Yang Wang, Shih-Fu Chang, Zhe Gan, Yinfei Yang

Via arXiv

From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models

Mar 25, 2024
Kung-Hsiang Huang, Hou Pong Chan, Yi R. Fung, Haoyi Qiu, Mingyang Zhou, Shafiq Joty, Shih-Fu Chang, Heng Ji

Via arXiv

SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos

Mar 03, 2024
Yulei Niu, Wenliang Guo, Long Chen, Xudong Lin, Shih-Fu Chang

Via arXiv

Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning

Dec 15, 2023
Kung-Hsiang Huang, Mingyang Zhou, Hou Pong Chan, Yi R. Fung, Zhenhailong Wang, Lingyu Zhang, Shih-Fu Chang, Heng Ji

Via arXiv

Video Summarization: Towards Entity-Aware Captions

Dec 01, 2023
Hammad A. Ayyubi, Tianqi Liu, Arsha Nagrani, Xudong Lin, Mingda Zhang, Anurag Arnab, Feng Han, Yukun Zhu, Jialu Liu, Shih-Fu Chang

Via arXiv

Characterizing Video Question Answering with Sparsified Inputs

Nov 27, 2023
Shiyuan Huang, Robinson Piramuthu, Vicente Ordonez, Shih-Fu Chang, Gunnar A. Sigurdsson

Via arXiv

Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond

Oct 31, 2023
Zhecan Wang, Long Chen, Haoxuan You, Keyang Xu, Yicheng He, Wenhao Li, Noel Codella, Kai-Wei Chang, Shih-Fu Chang

Via arXiv

Ferret: Refer and Ground Anything Anywhere at Any Granularity

Oct 11, 2023
Haoxuan You, Haotian Zhang, Zhe Gan, Xianzhi Du, Bowen Zhang, Zirui Wang, Liangliang Cao, Shih-Fu Chang, Yinfei Yang

Via arXiv