Kaicheng Yu

AutoLab, Westlake University

Baichuan-Omni Technical Report

Oct 11, 2024

DiVE: DiT-based Video Generation with Enhanced Control

Sep 03, 2024

BioKGBench: A Knowledge Graph Checking Benchmark of AI Agent for Biomedical Science

Jun 29, 2024

M3GIA: A Cognition Inspired Multilingual and Multimodal General Intelligence Ability Benchmark

Jun 08, 2024

Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation

Jun 03, 2024

AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis

Feb 27, 2024

OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection

Dec 12, 2023

BEVHeight++: Toward Robust Visual Centric 3D Object Detection

Sep 28, 2023

FusionFormer: A Multi-sensory Fusion in Bird's-Eye-View and Temporal Consistent Transformer for 3D Object Detection

Sep 11, 2023

Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training

Aug 18, 2023