
Yuan Zhang

JoyVoice: Long-Context Conditioning for Anthropomorphic Multi-Speaker Conversational Synthesis

Dec 22, 2025
Via arXiv

Translating Informal Proofs into Formal Proofs Using a Chain of States

Dec 12, 2025
Via arXiv

Single-step Diffusion-based Video Coding with Semantic-Temporal Guidance

Dec 08, 2025
Via arXiv

ActVAR: Activating Mixtures of Weights and Tokens for Efficient Visual Autoregressive Generation

Nov 17, 2025
Via arXiv

TimeSearch-R: Adaptive Temporal Search for Long-Form Video Understanding via Self-Verification Reinforcement Learning

Nov 07, 2025
Via arXiv

iFlyBot-VLM Technical Report

Nov 07, 2025
Via arXiv

OmniHuman-1.5: Instilling an Active Mind in Avatars via Cognitive Simulation

Aug 26, 2025
Via arXiv

THAT: Token-wise High-frequency Augmentation Transformer for Hyperspectral Pansharpening

Aug 11, 2025
Via arXiv

DiffCap: Diffusion-based Real-time Human Motion Capture using Sparse IMUs and a Monocular Camera

Aug 08, 2025
Via arXiv

MedReadCtrl: Personalizing Medical Text Generation with Readability-Controlled Instruction Learning

Jul 10, 2025
Via arXiv