Daiting Shi

ConsistRM: Improving Generative Reward Models via Consistency-Aware Self-Training

Apr 08, 2026
Via arXiv

ReflectRM: Boosting Generative Reward Models via Self-Reflection within a Unified Judgment Framework

Apr 08, 2026
Via arXiv

UniCreative: Unifying Long-form Logic and Short-form Sparkle via Reference-Free Reinforcement Learning

Apr 07, 2026
Via arXiv

Reconstructing Content via Collaborative Attention to Improve Multimodal Embedding Quality

Mar 02, 2026
Via arXiv

When Less is More: The LLM Scaling Paradox in Context Compression

Feb 10, 2026
Via arXiv

Bagging-Based Model Merging for Robust General Text Embeddings

Feb 05, 2026
Via arXiv

TRE: Encouraging Exploration in the Trust Region

Feb 03, 2026
Via arXiv

Advancing General-Purpose Reasoning Models with Modular Gradient Surgery

Feb 02, 2026
Via arXiv

Agentic-R: Learning to Retrieve for Agentic Search

Jan 17, 2026
Via arXiv

Reinforced Efficient Reasoning via Semantically Diverse Exploration

Jan 08, 2026
Via arXiv