Yunpeng Zhai

Peking University

Towards Robust LLM Post-Training: Automatic Failure Management for Reinforcement Fine-Tuning

May 06, 2026

E2E-REME: Towards End-to-End Microservices Auto-Remediation via Experience-Simulation Reinforcement Fine-Tuning

Apr 13, 2026

Hypothesize-Then-Verify: Speculative Root Cause Analysis for Microservices with Pathwise Parallelism

Jan 06, 2026

Agentic Memory Enhanced Recursive Reasoning for Root Cause Localization in Microservices

Jan 06, 2026

d-TreeRPO: Towards More Reliable Policy Optimization for Diffusion Language Models

Dec 10, 2025

AgentEvolver: Towards Efficient Self-Evolving Agent System

Nov 13, 2025

A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models

Aug 12, 2025

Omni-SafetyBench: A Benchmark for Safety Evaluation of Audio-Visual Large Language Models

Aug 10, 2025

Provoking Multi-modal Few-Shot LVLM via Exploration-Exploitation In-Context Learning

Jun 11, 2025

Population-Based Evolutionary Gaming for Unsupervised Person Re-identification

Jun 08, 2023