Dian Li

Vision-Language Instruction Tuning: A Review and Analysis

Nov 25, 2023
Chen Li, Yixiao Ge, Dian Li, Ying Shan

Via arXiv

HumTrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond

Sep 18, 2023
Shansong Liu, Xu Li, Dian Li, Ying Shan

Via arXiv

TagGPT: Large Language Models are Zero-shot Multimodal Taggers

Apr 06, 2023
Chen Li, Yixiao Ge, Jiayong Mao, Dian Li, Ying Shan

Via arXiv

Masked Visual Reconstruction in Language Semantic Space

Jan 17, 2023
Shusheng Yang, Yixiao Ge, Kun Yi, Dian Li, Ying Shan, Xiaohu Qie, Xinggang Wang

Via arXiv

Masked Image Modeling with Denoising Contrast

May 19, 2022
Kun Yi, Yixiao Ge, Xiaotong Li, Shusheng Yang, Dian Li, Jianping Wu, Ying Shan, Xiaohu Qie

Via arXiv

Controllable Augmentations for Video Representation Learning

Apr 01, 2022
Rui Qian, Weiyao Lin, John See, Dian Li

Via arXiv

BridgeFormer: Bridging Video-text Retrieval with Multiple Choice Questions

Jan 13, 2022
Yuying Ge, Yixiao Ge, Xihui Liu, Dian Li, Ying Shan, Xiaohu Qie, Ping Luo

Via arXiv

CaSP: Class-agnostic Semi-Supervised Pretraining for Detection and Segmentation

Dec 09, 2021
Lu Qi, Jason Kuen, Zhe Lin, Jiuxiang Gu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen, Jiaya Jia

Via arXiv

CLIP4Caption ++: Multi-CLIP for Video Caption

Oct 14, 2021
Mingkang Tang, Zhanyu Wang, Zhaoyang Zeng, Fengyun Rao, Dian Li

Via arXiv