Yuguang Yang

CL-CLIP: CLIP-Based Continual Learning Framework with Cost-Volume Category Decoupling for Object Detection

Jun 05, 2026
Via arXiv

AnyAudio-Judge: A Dynamic Rubric-Based Benchmark and Evaluator for Audio Instruction Following

Jun 02, 2026
Via arXiv

CLOVER: Closed-Loop Value Estimation & Ranking for End-to-End Autonomous Driving Planning

May 14, 2026
Via arXiv

Learning domain-invariant features through channel-level sparsification for Out-Of Distribution Generalization

Mar 26, 2026
Via arXiv

AMLRIS: Alignment-aware Masked Learning for Referring Image Segmentation

Feb 26, 2026
Via arXiv

From Representational Complementarity to Dual Systems: Synergizing VLM and Vision-Only Backbones for End-to-End Driving

Feb 11, 2026
Via arXiv

OpenAI GPT-5 System Card

Dec 19, 2025
Via arXiv

Squeeze10-LLM: Squeezing LLMs' Weights by 10 Times via a Staged Mixed-Precision Quantization Method

Jul 24, 2025
Via arXiv

S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamlessly Speech-Text Alignment and Streaming Speech Decoder

Jun 16, 2025
Via arXiv

ClapFM-EVC: High-Fidelity and Flexible Emotional Voice Conversion with Dual Control from Natural Language and Speech

May 20, 2025
Via arXiv