Jianwei Yang

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation

Nov 13, 2023
An Yan, Zhengyuan Yang, Wanrong Zhu, Kevin Lin, Linjie Li, Jianfeng Wang, Jianwei Yang, Yiwu Zhong, Julian McAuley, Jianfeng Gao, Zicheng Liu, Lijuan Wang

LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Nov 09, 2023
Shilong Liu, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang, Jianfeng Gao, Chunyuan Li

LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing

Nov 01, 2023
Wei-Ge Chen, Irina Spiridonova, Jianwei Yang, Jianfeng Gao, Chunyuan Li

BiomedJourney: Counterfactual Biomedical Image Generation by Instruction-Learning from Multimodal Patient Journeys

Oct 21, 2023
Yu Gu, Jianwei Yang, Naoto Usuyama, Chunyuan Li, Sheng Zhang, Matthew P. Lungren, Jianfeng Gao, Hoifung Poon

LACMA: Language-Aligning Contrastive Learning with Meta-Actions for Embodied Instruction Following

Oct 18, 2023
Cheng-Fu Yang, Yen-Chun Chen, Jianwei Yang, Xiyang Dai, Lu Yuan, Yu-Chiang Frank Wang, Kai-Wei Chang

Learning from Rich Semantics and Coarse Locations for Long-tailed Object Detection

Oct 18, 2023
Lingchen Meng, Xiyang Dai, Jianwei Yang, Dongdong Chen, Yinpeng Chen, Mengchen Liu, Yi-Ling Chen, Zuxuan Wu, Lu Yuan, Yu-Gang Jiang

Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V

Oct 17, 2023
Jianwei Yang, Hao Zhang, Feng Li, Xueyan Zou, Chunyuan Li, Jianfeng Gao

Multimodal Foundation Models: From Specialists to General-Purpose Assistants

Sep 18, 2023
Chunyuan Li, Zhe Gan, Zhengyuan Yang, Jianwei Yang, Linjie Li, Lijuan Wang, Jianfeng Gao

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

Sep 18, 2023
Yadong Lu, Chunyuan Li, Haotian Liu, Jianwei Yang, Jianfeng Gao, Yelong Shen
