Lu Yuan

Stephen

On Pre-training of Multimodal Language Models Customized for Chart Understanding

Jul 19, 2024

Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge

Jul 05, 2024

Efficient Modulation for Vision Networks

Mar 29, 2024

OmniVid: A Generative Framework for Universal Video Understanding

Mar 26, 2024

Generative Enhancement for 3D Medical Images

Mar 19, 2024

Block and Detail: Scaffolding Sketch-to-Image Generation

Feb 28, 2024

Knowledge Graph Driven UAV Cognitive Semantic Communication Systems for Efficient Object Detection

Jan 25, 2024

iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views

Dec 28, 2023

Learning Subject-Aware Cropping by Outpainting Professional Photos

Dec 19, 2023

Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models

Nov 28, 2023