
Gyouk Chu

Discounted Beta-Bernoulli Reward Estimation for Sample-Efficient Reinforcement Learning with Verifiable Rewards

Mar 19, 2026

Argument Reconstruction as Supervision for Critical Thinking in LLMs

Mar 18, 2026

Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models

Feb 18, 2025