Pengcheng He

Latent Recurrent Transformer: Architecture Exploration, Training Strategies, and Scaling Behavior

May 26, 2026

Orchard: An Open-Source Agentic Modeling Framework

May 14, 2026

Reinforcement World Model Learning for LLM-based Agents

Feb 05, 2026

RL from Teacher-Model Refinement: Gradual Imitation Learning for Machine Translation

Jul 29, 2025

Chain of Draft: Thinking Faster by Writing Less

Feb 25, 2025

Switchable Decision: Dynamic Neural Generation Networks

May 07, 2024

LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

Oct 23, 2023

Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective

Oct 17, 2023

Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion Modeling

Oct 10, 2023

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Sep 07, 2023