Han Zhao

When LRP Diverges from Leave-One-Out in Transformers

Oct 21, 2025
Via arXiv

MELA-TTS: Joint transformer-diffusion model with representation alignment for speech synthesis

Sep 18, 2025
Via arXiv

FunAudio-ASR Technical Report

Sep 15, 2025
Via arXiv

Boosting Embodied AI Agents through Perception-Generation Disaggregation and Asynchronous Pipeline Execution

Sep 11, 2025
Via arXiv

VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model

Sep 11, 2025
Via arXiv

Omne-R1: Learning to Reason with Memory for Multi-hop Question Answering

Aug 24, 2025
Via arXiv

ReconVLA: Reconstructive Vision-Language-Action Model as Effective Robot Perceiver

Aug 14, 2025
Via arXiv

CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding

Jun 16, 2025
Via arXiv

RationalVLA: A Rational Vision-Language-Action Model with Dual System

Jun 12, 2025
Via arXiv

Moment Alignment: Unifying Gradient and Hessian Matching for Domain Generalization

Jun 09, 2025
Via arXiv