Shikun Zhang

Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling

Apr 26, 2026
Via arXiv

EVE: Verifiable Self-Evolution of MLLMs via Executable Visual Transformations

Apr 20, 2026
Via arXiv

Retrieval as Generation: A Unified Framework with Self-Triggered Information Planning

Apr 13, 2026
Via arXiv

Instruction Data Selection via Answer Divergence

Apr 12, 2026
Via arXiv

Data Selection for Multi-turn Dialogue Instruction Tuning

Apr 09, 2026
Via arXiv

SteerRM: Debiasing Reward Models via Sparse Autoencoders

Mar 13, 2026
Via arXiv

From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models

Feb 26, 2026
Via arXiv

Advancing Block Diffusion Language Models for Test-Time Scaling

Feb 11, 2026
Via arXiv

OPE: Overcoming Information Saturation in Parallel Thinking via Outline-Guided Path Exploration

Feb 09, 2026
Via arXiv

What Do Agents Learn from Trajectory-SFT: Semantics or Interfaces?

Feb 02, 2026
Via arXiv