Yifei Huang

CaST-Bench: Benchmarking Causal Chain-Grounded Spatio-Temporal Reasoning for Video Question Answering

May 22, 2026

Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?

May 21, 2026

SocialDirector: Training-Free Social Interaction Control for Multi-Person Video Generation

May 11, 2026

Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning

Mar 24, 2026

Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline

Mar 05, 2026

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026

Towards Interactive Intelligence for Digital Humans

Dec 15, 2025

The N-Body Problem: Parallel Execution from Single-Person Egocentric Video

Dec 12, 2025

UniLS: End-to-End Audio-Driven Avatars for Unified Listening and Speaking

Dec 10, 2025

Living the Novel: A System for Generating Self-Training Timeline-Aware Conversational Agents from Novels

Dec 08, 2025