Leyi Pan

RLCSD: Reinforcement Learning with Contrastive On-Policy Self-Distillation

Jun 10, 2026
Via arXiv

Baichuan-M4: A Clinical-Grade Medical Agent System for Continuous Care

Jun 09, 2026
Via arXiv

Locally Confident, Globally Stuck: The Quality-Exploration Dilemma in Diffusion Language Models

Apr 01, 2026
Via arXiv

Agentic Memory Enhanced Recursive Reasoning for Root Cause Localization in Microservices

Jan 06, 2026
Via arXiv

Hypothesize-Then-Verify: Speculative Root Cause Analysis for Microservices with Pathwise Parallelism

Jan 06, 2026
Via arXiv

d-TreeRPO: Towards More Reliable Policy Optimization for Diffusion Language Models

Dec 10, 2025
Via arXiv

A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models

Aug 12, 2025
Via arXiv

Omni-SafetyBench: A Benchmark for Safety Evaluation of Audio-Visual Large Language Models

Aug 10, 2025
Via arXiv

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Jul 02, 2025
Via arXiv

Can LLM Watermarks Robustly Prevent Unauthorized Knowledge Distillation?

Feb 17, 2025
Via arXiv