Liunian Harold Li

Matryoshka Query Transformer for Large Vision-Language Models

May 29, 2024

Tailoring Self-Rationalizers with Multi-Reward Distillation

Nov 06, 2023

DesCo: Learning Object Recognition with Rich Language Descriptions

Jun 24, 2023

Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step

Jun 24, 2023

MetaVL: Transferring In-Context Learning Ability From Language Models to Vision-Language Models

Jun 02, 2023

GLIPv2: Unifying Localization and Vision-Language Understanding

Jun 12, 2022

DisinfoMeme: A Multimodal Dataset for Detecting Memes Intentionally Spreading Out Disinformation

May 25, 2022

On the Paradox of Learning to Reason from Data

May 24, 2022

GeoMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models

May 24, 2022

ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models

Apr 20, 2022