Hao Li

Jack

Efficient Scaling of Diffusion Transformers for Text-to-Image Generation

Dec 16, 2024

SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding

Dec 12, 2024

Unified Vertex Motion Estimation for Integrated Video Stabilization and Stitching in Tractor-Trailer Wheeled Robots

Dec 10, 2024

Political Actor Agent: Simulating Legislative System for Roll Call Votes Prediction with Large Language Models

Dec 10, 2024

Radiant: Large-scale 3D Gaussian Rendering based on Hierarchical Framework

Dec 07, 2024

LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment

Dec 06, 2024

LiDAR SLAMMOT based on Confidence-guided Data Association

Dec 02, 2024

FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration

Dec 02, 2024

VLSBench: Unveiling Visual Leakage in Multimodal Safety

Nov 29, 2024

Visual SLAMMOT Considering Multiple Motion Models

Nov 28, 2024