Tiejun Zhao

Long-form RewardBench: Evaluating Reward Models for Long-form Generation

Mar 13, 2026
Via arXiv

Toward Robust LLM-Based Judges: Taxonomic Bias Evaluation and Debiasing Optimization

Mar 09, 2026
Via arXiv

Beyond Token-Level Policy Gradients for Complex Reasoning with Large Language Models

Feb 16, 2026
Via arXiv

Thinking with Comics: Enhancing Multimodal Reasoning through Structured Visual Storytelling

Feb 03, 2026
Via arXiv

RM-Distiller: Exploiting Generative LLM for Reward Model Distillation

Jan 20, 2026
Via arXiv

From Perception to Reasoning: Deep Thinking Empowers Multimodal Large Language Models

Nov 18, 2025
Via arXiv

ToolSample: Dual Dynamic Sampling Methods with Curriculum Learning for RL-based Tool Learning

Sep 18, 2025
Via arXiv

Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design

May 29, 2025
Via arXiv

Cross-Domain Bilingual Lexicon Induction via Pretrained Language Models

May 29, 2025
Via arXiv

Enhancing Large Language Models' Machine Translation via Dynamic Focus Anchoring

May 29, 2025
Via arXiv