Muyun Yang

Long-form RewardBench: Evaluating Reward Models for Long-form Generation

Mar 13, 2026
Via arXiv

Toward Robust LLM-Based Judges: Taxonomic Bias Evaluation and Debiasing Optimization

Mar 09, 2026
Via arXiv

Beyond Token-Level Policy Gradients for Complex Reasoning with Large Language Models

Feb 16, 2026
Via arXiv

Thinking with Comics: Enhancing Multimodal Reasoning through Structured Visual Storytelling

Feb 03, 2026
Via arXiv

RM-Distiller: Exploiting Generative LLM for Reward Model Distillation

Jan 20, 2026
Via arXiv

DiVA: Fine-grained Factuality Verification with Agentic-Discriminative Verifier

Jan 07, 2026
Via arXiv

Reasoning Model Is Superior LLM-Judge, Yet Suffers from Biases

Jan 07, 2026
Via arXiv

Lost in Benchmarks? Rethinking Large Language Model Benchmarking with Item Response Theory

May 21, 2025
Via arXiv

Memory-augmented Query Reconstruction for LLM-based Knowledge Graph Reasoning

Mar 07, 2025
Via arXiv

MuSC: Improving Complex Instruction Following with Multi-granularity Self-Contrastive Training

Feb 17, 2025
Via arXiv