Yuan Zhang

TimeSearch-R: Adaptive Temporal Search for Long-Form Video Understanding via Self-Verification Reinforcement Learning

Nov 07, 2025

OmniHuman-1.5: Instilling an Active Mind in Avatars via Cognitive Simulation

Aug 26, 2025

THAT: Token-wise High-frequency Augmentation Transformer for Hyperspectral Pansharpening

Aug 11, 2025

DiffCap: Diffusion-based Real-time Human Motion Capture using Sparse IMUs and a Monocular Camera

Aug 08, 2025

MedReadCtrl: Personalizing medical text generation with readability-controlled instruction learning

Jul 10, 2025

Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs

Jun 12, 2025

Fast ECoT: Efficient Embodied Chain-of-Thought via Thoughts Reuse

Jun 09, 2025

PathFL: Multi-Alignment Federated Learning for Pathology Image Segmentation

May 28, 2025

Faithful Group Shapley Value

May 25, 2025

One-Step Diffusion-Based Image Compression with Semantic Distillation

May 22, 2025