Lianli Gao

Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization

May 24, 2024

Text-Video Retrieval with Global-Local Semantic Consistent Learning

May 21, 2024

RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception

May 17, 2024

EchoReel: Enhancing Action Generation of Existing Video Diffusion Models

Mar 18, 2024

CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model

Mar 13, 2024

Training-Free Semantic Video Composition via Pre-trained Diffusion Model

Jan 17, 2024

Context-based Transfer and Efficient Iterative Learning for Unbiased Scene Graph Generation

Dec 29, 2023

ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval

Dec 19, 2023

F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis

Dec 06, 2023

Make-A-Storyboard: A General Framework for Storyboard with Disentangled and Merged Control

Dec 06, 2023