Kunhao Zheng

WybeCoder: Verified Imperative Code Generation

Mar 31, 2026
Via arXiv

Optimizing Language Models for Inference Time Objectives using Reinforcement Learning

Mar 25, 2025
Via arXiv

The KoLMogorov Test: Compression by Code Generation

Mar 18, 2025
Via arXiv

Soft Policy Optimization: Online Off-Policy RL for Sequence Models

Mar 07, 2025
Via arXiv

PILAF: Optimal Human Preference Sampling for Reward Modeling

Feb 06, 2025
Via arXiv

What Makes Large Language Models Reason in (Multi-Turn) Code Generation?

Oct 10, 2024
Via arXiv

RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning

Oct 02, 2024
Via arXiv

D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory

Mar 01, 2023
Via arXiv

Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization

Dec 19, 2022
Via arXiv

Formal Mathematics Statement Curriculum Learning

Feb 03, 2022
Via arXiv