Pengyu Cheng

MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination

Mar 25, 2026
Via arXiv

Borderless Long Speech Synthesis

Mar 20, 2026
Via arXiv

CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR

Mar 10, 2026
Via arXiv

Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance

Dec 29, 2025
Via arXiv

Search Self-play: Pushing the Frontier of Agent Capability without Supervision

Oct 21, 2025
Via arXiv

Kimi-VL Technical Report

Apr 10, 2025
Via arXiv

Detecting AI Flaws: Target-Driven Attacks on Internal Faults in Language Models

Aug 27, 2024
Via arXiv

Self-playing Adversarial Language Game Enhances LLM Reasoning

Apr 16, 2024
Via arXiv

On Diversified Preferences of Large Language Model Alignment

Dec 25, 2023
Via arXiv

Adversarial Preference Optimization

Nov 14, 2023
Via arXiv