Baoyuan Wang

Boosting MLLM Spatial Reasoning with Geometrically Referenced 3D Scene Representations

Mar 09, 2026

Towards Objectively Benchmarking Social Intelligence for Language Agents at Action Level

Apr 08, 2024

Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer

Mar 20, 2024

Subobject-level Image Tokenization

Feb 22, 2024

Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations

Feb 19, 2024

GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance

Dec 12, 2023

PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns

Dec 07, 2023

AgentAvatar: Disentangling Planning, Driving and Rendering for Photorealistic Avatar Agents

Dec 04, 2023

Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data

Nov 30, 2023

A Unified Framework for Multimodal, Multi-Part Human Motion Synthesis

Nov 28, 2023