Zhaoxiang Zhang

Reconstructive Visual Instruction Tuning

Oct 12, 2024

MIO: A Foundation Model on Multimodal Tokens

Sep 26, 2024

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Sep 24, 2024

SimMAT: Exploring Transferability from Vision Foundation Models to Any Image Modality

Sep 12, 2024

Enhancing Sound Source Localization via False Negative Elimination

Aug 29, 2024

CityX: Controllable Procedural Content Generation for Unbounded 3D Cities

Jul 29, 2024

General Geometry-aware Weakly Supervised 3D Object Detection

Jul 18, 2024

Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation

Jul 18, 2024

Monocular Occupancy Prediction for Scalable Indoor Scenes

Jul 16, 2024

Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection

Jun 18, 2024