Lei Li

Carnegie Mellon University

Deep Learning for Trajectory Data Management and Mining: A Survey and Beyond

Mar 21, 2024

Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition

Mar 19, 2024

Word Order's Impacts: Insights from Reordering and Generation Analysis

Mar 18, 2024

Tree Counting by Bridging 3D Point Clouds with Imagery

Mar 12, 2024

MedFLIP: Medical Vision-and-Language Self-supervised Fast Pre-Training with Masked Autoencoder

Mar 07, 2024

SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM

Mar 07, 2024

ImgTrojan: Jailbreaking Vision-Language Models with ONE Image

Mar 06, 2024

Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models

Mar 04, 2024

TempCompass: Do Video LLMs Really Understand Videos?

Mar 01, 2024

Hire a Linguist!: Learning Endangered Languages with In-Context Linguistic Descriptions

Feb 28, 2024