Zhaoxiang Zhang

LayerAnimate: Layer-specific Control for Animation

Jan 14, 2025

DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers

Dec 24, 2024

FreeSim: Toward Free-viewpoint Camera Simulation in Driving Scenes

Dec 04, 2024

FullStack Bench: Evaluating LLMs as Full Stack Coders

Dec 03, 2024

SimCMF: A Simple Cross-modal Fine-tuning Strategy from Vision Foundation Models to Any Imaging Modality

Nov 27, 2024

OOD-HOI: Text-Driven 3D Whole-Body Human-Object Interactions Generation Beyond Training Domains

Nov 27, 2024

Revisiting Marr in Face: The Building of 2D--2.5D--3D Representations in Deep Neural Networks

Nov 25, 2024

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Nov 07, 2024

VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization

Nov 03, 2024

CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes

Nov 01, 2024