Chen Zhang

SenseTime Research

Large Language Models Do Not Always Need Readable Language

Jun 18, 2026

FoleyGenEx: Unified Video-to-Audio Generation with Multi-Modal Control, Temporal Alignment, and Semantic Precision

Jun 12, 2026

Self-Harness: Harnesses That Improve Themselves

Jun 08, 2026

Towards Unified Song Generation and Singing Voice Conversion with Accompaniment Co-Generation

Jun 05, 2026

RhinoVLA Technical Report

Jun 05, 2026

Quantifying the Energy Floor: Direct Measurement and Replay Buffer Bias in SAC-Based HVAC Control on sbsim

Jun 01, 2026

SegTune: Structured and Fine-Grained Control for Song Generation

May 31, 2026

ATLAS: All-round Testing of Long-context Abilities across Scales

May 27, 2026

Learning to Adapt SFT Data for Better Reasoning Generalization

May 26, 2026

OmniISR: A Unified Framework for Centralized and Federated Learning via Intermediate Supervision and Regularization

May 19, 2026