Zhiheng Li

Efficient Scaling of Diffusion Transformers for Text-to-Image Generation

Dec 16, 2024

LOMA: Language-assisted Semantic Occupancy Network via Triplane Mamba

Dec 11, 2024

Will Large Language Models be a Panacea to Autonomous Driving?

Sep 24, 2024

Rhythmic Foley: A Framework For Seamless Audio-Visual Alignment In Video-to-Audio Synthesis

Sep 13, 2024

Trajectory Planning for Teleoperated Space Manipulators Using Deep Reinforcement Learning

Aug 10, 2024

Integrating Controllable Motion Skills from Demonstrations

Aug 06, 2024

StreamMOS: Streaming Moving Object Segmentation with Multi-View Perception and Dual-Span Memory

Jul 25, 2024

KiGRAS: Kinematic-Driven Generative Model for Realistic Agent Simulation

Jul 17, 2024

FlowTrack: Point-level Flow Network for 3D Single Object Tracking

Jul 02, 2024

StreamMOTP: Streaming and Unified Framework for Joint 3D Multi-Object Tracking and Trajectory Prediction

Jun 28, 2024