Debing Zhang

Tackling Length Inflation Without Trade-offs: Group Relative Reward Rescaling for Reinforcement Learning

Mar 11, 2026

SEA-Nav: Efficient Policy Learning for Safe and Agile Quadruped Navigation in Cluttered Environments

Mar 10, 2026

JTok: On Token Embedding as another Axis of Scaling Law via Joint Token Self-modulation

Jan 31, 2026

LongBench Pro: A More Realistic and Comprehensive Bilingual Long-Context Evaluation Benchmark

Jan 06, 2026

Coupled Variational Reinforcement Learning for Language Model General Reasoning

Dec 14, 2025

LiteLong: Resource-Efficient Long-Context Data Synthesis for LLMs

Sep 19, 2025

Flow-Anything: Learning Real-World Optical Flow Estimation from Large-Scale Single-view Images

Jun 09, 2025

dots.llm1 Technical Report

Jun 06, 2025

Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction

May 27, 2025

LongMagpie: A Self-synthesis Method for Generating Large-scale Long-context Instructions

May 22, 2025