Shaoli Huang

Democratizing High-Fidelity Co-Speech Gesture Video Generation

Jul 09, 2025

Bilateral Collaboration with Large Vision-Language Models for Open Vocabulary Human-Object Interaction Detection

Jul 09, 2025

Guiding Human-Object Interactions with Rich Geometry and Relations

Mar 26, 2025

HoloGest: Decoupled Diffusion and Motion Priors for Generating Holisticly Expressive Co-speech Gestures

Mar 17, 2025

RopeTP: Global Human Motion Recovery via Integrating Robust Pose Estimation with Diffusion Trajectory Prior

Oct 27, 2024

Conditional GAN for Enhancing Diffusion Models in Efficient and Authentic Global Gesture Generation from Audios

Oct 27, 2024

ExpGest: Expressive Speaker Generation Using Diffusion Model and Hybrid Audio-Text Guidance

Oct 12, 2024

ReinDiffuse: Crafting Physically Plausible Motions with Reinforced Diffusion Model

Oct 09, 2024

CT4D: Consistent Text-to-4D Generation with Animatable Meshes

Aug 15, 2024

GrootVL: Tree Topology is All You Need in State Space Model

Jun 04, 2024