Yuan Ge

On the Emotion Understanding of Synthesized Speech

Mar 17, 2026

StyleBench: Evaluating Speech Language Models on Conversational Speaking Style Control

Mar 08, 2026

When Scaling Fails: Mitigating Audio Perception Decay of LALMs via Multi-Step Perception-Aware Reasoning

Feb 28, 2026

VisualActBench: Can VLMs See and Act like a Human?

Dec 10, 2025

SubQRAG: Sub-Question Driven Dynamic Graph RAG

Oct 09, 2025

FLEXI: Benchmarking Full-duplex Human-LLM Speech Interaction

Sep 26, 2025

SageLM: A Multi-aspect and Explainable Large Language Model for Speech Judgement

Aug 28, 2025

Attention2Probability: Attention-Driven Terminology Probability Estimation for Robust Speech-to-Text System

Aug 26, 2025

A Modular-based Strategy for Mitigating Gradient Conflicts in Simultaneous Speech Translation

Sep 24, 2024

NDP: Next Distribution Prediction as a More Broad Target

Aug 30, 2024