Junnan Li

What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases

Apr 03, 2024
Anthony Meng Huat Tiong, Junqi Zhao, Boyang Li, Junnan Li, Steven C. H. Hoi, Caiming Xiong

X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning

Nov 30, 2023
Artemis Panagopoulou, Le Xue, Ning Yu, Junnan Li, Dongxu Li, Shafiq Joty, Ran Xu, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles

CodeTF: One-stop Transformer Library for State-of-the-art Code LLM

May 31, 2023
Nghi D. Q. Bui, Hung Le, Yue Wang, Junnan Li, Akhilesh Deepak Gotmare, Steven C. H. Hoi

BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing

May 24, 2023
Dongxu Li, Junnan Li, Steven C. H. Hoi

CodeT5+: Open Code Large Language Models for Code Understanding and Generation

May 20, 2023
Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D. Q. Bui, Junnan Li, Steven C. H. Hoi

ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding

May 18, 2023
Le Xue, Ning Yu, Shu Zhang, Junnan Li, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, Ran Xu, Juan Carlos Niebles, Silvio Savarese

InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning

May 11, 2023
Wenliang Dai, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Junqi Zhao, Weisheng Wang, Boyang Li, Pascale Fung, Steven Hoi

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

Jan 30, 2023
Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
