Dian Li

MetaTool: Facilitating Large Language Models to Master Tools with Meta-task Augmentation

Jul 15, 2024
Via arXiv

Vision-Language Instruction Tuning: A Review and Analysis

Nov 25, 2023
Via arXiv

HumTrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond

Sep 18, 2023
Via arXiv

TagGPT: Large Language Models are Zero-shot Multimodal Taggers

Apr 06, 2023
Via arXiv

Masked Visual Reconstruction in Language Semantic Space

Jan 17, 2023
Via arXiv

Masked Image Modeling with Denoising Contrast

May 19, 2022
Via arXiv

Controllable Augmentations for Video Representation Learning

Apr 01, 2022
Via arXiv

BridgeFormer: Bridging Video-text Retrieval with Multiple Choice Questions

Jan 13, 2022
Via arXiv

CaSP: Class-agnostic Semi-Supervised Pretraining for Detection and Segmentation

Dec 09, 2021
Via arXiv

CLIP4Caption ++: Multi-CLIP for Video Caption

Oct 14, 2021
Via arXiv