Lianli Gao

Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization

May 24, 2024

Text-Video Retrieval with Global-Local Semantic Consistent Learning

May 21, 2024

RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception

May 17, 2024

EchoReel: Enhancing Action Generation of Existing Video Diffusion Models

Mar 18, 2024

CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model

Mar 13, 2024

Training-Free Semantic Video Composition via Pre-trained Diffusion Model

Jan 17, 2024

Context-based Transfer and Efficient Iterative Learning for Unbiased Scene Graph Generation

Dec 29, 2023

ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval

Dec 19, 2023

Make-A-Storyboard: A General Framework for Storyboard with Disentangled and Merged Control

Dec 06, 2023

F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis

Dec 06, 2023