Jialin Wu

GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning

Dec 19, 2023
Via arXiv

Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts

Dec 01, 2023
Via arXiv

Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling

Oct 18, 2023
Via arXiv

PaLI-3 Vision Language Models: Smaller, Faster, Stronger

Oct 17, 2023
Via arXiv

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Oct 17, 2023
Via arXiv

CausalLM is not optimal for in-context learning

Sep 03, 2023
Via arXiv

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

Jul 28, 2023
Via arXiv

PaLI-X: On Scaling up a Multilingual Vision and Language Model

May 29, 2023
Via arXiv

Entity-Focused Dense Passage Retrieval for Outside-Knowledge Visual Question Answering

Oct 18, 2022
Via arXiv

Multi-Modal Answer Validation for Knowledge-Based VQA

Mar 23, 2021
Via arXiv