Tiejun Zhao

Long-form RewardBench: Evaluating Reward Models for Long-form Generation

Mar 13, 2026
Via arXiv

Toward Robust LLM-Based Judges: Taxonomic Bias Evaluation and Debiasing Optimization

Mar 09, 2026
Via arXiv

Beyond Token-Level Policy Gradients for Complex Reasoning with Large Language Models

Feb 16, 2026
Via arXiv

Thinking with Comics: Enhancing Multimodal Reasoning through Structured Visual Storytelling

Feb 03, 2026
Via arXiv

RM-Distiller: Exploiting Generative LLM for Reward Model Distillation

Jan 20, 2026
Via arXiv

From Perception to Reasoning: Deep Thinking Empowers Multimodal Large Language Models

Nov 18, 2025
Via arXiv

ToolSample: Dual Dynamic Sampling Methods with Curriculum Learning for RL-based Tool Learning

Sep 18, 2025
Via arXiv

Enhancing Large Language Models' Machine Translation via Dynamic Focus Anchoring

May 29, 2025
Via arXiv

Cross-Domain Bilingual Lexicon Induction via Pretrained Language Models

May 29, 2025
Via arXiv

Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design

May 29, 2025
Via arXiv