Lun Wang

University of California, Berkeley

On the Stability of Prompt Ranking in Large Language Model Evaluation

Jun 23, 2026

Steering LLMs for Culturally Localized Generation

Mar 24, 2026

OpenSage: Self-programming Agent Generation Engine

Feb 18, 2026

rePIRL: Learn PRM with Inverse RL for LLM Reasoning

Feb 08, 2026

TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents

Feb 06, 2026

DevOps-Gym: Benchmarking AI Agents in Software DevOps Cycle

Jan 27, 2026

MUSIC: MUlti-Step Instruction Contrast for Multi-Turn Reward Models

Dec 31, 2025

Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process

Dec 30, 2025

Eliciting Behaviors in Multi-Turn Conversations

Dec 29, 2025

Comparative Analysis of Large Language Models for Context-Aware Code Completion using SAFIM Framework

Feb 21, 2025