Zhiyuan Zhang

I2V3D: Controllable image-to-video generation with 3D guidance

Mar 12, 2025
Via arXiv

DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving

Mar 07, 2025
Via arXiv

BHViT: Binarized Hybrid Vision Transformer

Mar 05, 2025
Via arXiv

Bench2Drive-R: Turning Real World Data into Reactive Closed-Loop Autonomous Driving Benchmark by Generative Model

Dec 11, 2024
Via arXiv

SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing

Oct 15, 2024
Via arXiv

Residual Descent Differential Dynamic Game (RD3G) -- A Fast Newton Solver for Constrained General Sum Games

Sep 18, 2024
Via arXiv

RISurConv: Rotation Invariant Surface Attention-Augmented Convolutions for 3D Point Cloud Classification and Segmentation

Aug 12, 2024
Via arXiv

Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving

Jun 06, 2024
Via arXiv

Motion Avatar: Generate Human and Animal Avatars with Arbitrary Motion

May 18, 2024
Via arXiv

Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

Apr 16, 2024
Via arXiv