Yu-Wing Tai

Tencent

SmartAvatar: Text- and Image-Guided Human Avatar Generation with VLM AI Agents

Jun 05, 2025
Via arXiv

MA-RAG: Multi-Agent Retrieval-Augmented Generation via Collaborative Chain-of-Thought Reasoning

May 26, 2025
Via arXiv

Agentic 3D Scene Generation with Spatially Contextualized VLMs

May 26, 2025
Via arXiv

ThinkVideo: High-Quality Reasoning Video Segmentation with Chain of Thoughts

May 24, 2025
Via arXiv

FusionSegReID: Advancing Person Re-Identification with Multimodal Retrieval and Precise Segmentation

Mar 27, 2025
Via arXiv

Multimodal Generation of Animatable 3D Human Models with AvatarForge

Mar 11, 2025
Via arXiv

Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts

Mar 10, 2025
Via arXiv

ReelWave: A Multi-Agent Framework Toward Professional Movie Sound Generation

Mar 10, 2025
Via arXiv

Dynamic Path Navigation for Motion Agents with LLM Reasoning

Mar 10, 2025
Via arXiv

WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents

Feb 21, 2025
Via arXiv