Shanghang Zhang

CrayonRobo: Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation

May 04, 2025
Via arXiv

Co$^{3}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion

May 03, 2025
Via arXiv

ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance

Apr 23, 2025
Via arXiv

EmbodiedOcc++: Boosting Embodied 3D Occupancy Prediction with Plane Regularization and Uncertainty Sampler

Apr 13, 2025
Via arXiv

Segment Any Motion in Videos

Mar 28, 2025
Via arXiv

Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning

Mar 27, 2025
Via arXiv

MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation

Mar 26, 2025
Via arXiv

EmpathyAgent: Can Embodied Agents Conduct Empathetic Actions?

Mar 19, 2025
Via arXiv

HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model

Mar 13, 2025
Via arXiv

RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete

Feb 28, 2025
Via arXiv