Haoxuan You

Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

Apr 11, 2024
Haotian Zhang, Haoxuan You, Philipp Dufter, Bowen Zhang, Chen Chen, Hong-You Chen, Tsu-Jui Fu, William Yang Wang, Shih-Fu Chang, Zhe Gan, Yinfei Yang

LLM-based Conversational AI Therapist for Daily Functioning Screening and Psychotherapeutic Intervention via Everyday Smart Devices

Mar 16, 2024
Jingping Nie, Hanya Shao, Yuang Fan, Qijia Shao, Haoxuan You, Matthias Preindl, Xiaofan Jiang

Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond

Oct 31, 2023
Zhecan Wang, Long Chen, Haoxuan You, Keyang Xu, Yicheng He, Wenhao Li, Noel Codella, Kai-Wei Chang, Shih-Fu Chang

Ferret: Refer and Ground Anything Anywhere at Any Granularity

Oct 11, 2023
Haoxuan You, Haotian Zhang, Zhe Gan, Xianzhi Du, Bowen Zhang, Zirui Wang, Liangliang Cao, Shih-Fu Chang, Yinfei Yang

UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding

Jul 03, 2023
Rui Sun, Zhecan Wang, Haoxuan You, Noel Codella, Kai-Wei Chang, Shih-Fu Chang

IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models

May 24, 2023
Haoxuan You, Rui Sun, Zhecan Wang, Long Chen, Gengyu Wang, Hammad A. Ayyubi, Kai-Wei Chang, Shih-Fu Chang

CoBIT: A Contrastive Bi-directional Image-Text Generation Model

Mar 23, 2023
Haoxuan You, Mandy Guo, Zhecan Wang, Kai-Wei Chang, Jason Baldridge, Jiahui Yu

Find Someone Who: Visual Commonsense Understanding in Human-Centric Grounding

Dec 14, 2022
Haoxuan You, Rui Sun, Zhecan Wang, Kai-Wei Chang, Shih-Fu Chang

Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense

Nov 10, 2022
Zhecan Wang, Haoxuan You, Yicheng He, Wenhao Li, Kai-Wei Chang, Shih-Fu Chang
