Shuyang Sun

Efficiently Reconstructing Dynamic Scenes One D4RT at a Time

Dec 10, 2025
Via arXiv

Robot Learning from a Physical World Model

Nov 10, 2025
Via arXiv

CyberV: Cybernetics for Test-time Scaling in Video Understanding

Jun 09, 2025
Via arXiv

Diffusion Models Need Visual Priors for Image Generation

Oct 11, 2024
Via arXiv

DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer

Sep 12, 2024
Via arXiv

kNN-CLIP: Retrieval Enables Training-Free Segmentation on Continually Expanding Large Vocabularies

Apr 15, 2024
Via arXiv

SynArtifact: Classifying and Alleviating Artifacts in Synthetic Images via Vision-Language Model

Mar 05, 2024
Via arXiv

RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model

Feb 16, 2024
Via arXiv

CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor

Dec 21, 2023
Via arXiv

Real-Fake: Effective Training Data Synthesis Through Distribution Matching

Oct 16, 2023
Via arXiv