Yifei Huang

SocialDirector: Training-Free Social Interaction Control for Multi-Person Video Generation

May 11, 2026
Via arXiv

Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning

Mar 24, 2026
Via arXiv

Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline

Mar 05, 2026
Via arXiv

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026
Via arXiv

Towards Interactive Intelligence for Digital Humans

Dec 15, 2025
Via arXiv

The N-Body Problem: Parallel Execution from Single-Person Egocentric Video

Dec 12, 2025
Via arXiv

UniLS: End-to-End Audio-Driven Avatars for Unified Listening and Speaking

Dec 10, 2025
Via arXiv

Living the Novel: A System for Generating Self-Training Timeline-Aware Conversational Agents from Novels

Dec 08, 2025
Via arXiv

Can MLLMs Read the Room? A Multimodal Benchmark for Verifying Truthfulness in Multi-Party Social Interactions

Oct 31, 2025
Via arXiv

Solving the Hubbard model with Neural Quantum States

Jul 03, 2025
Via arXiv