
Daniil Tiapkin

CMAP, LMO

On Global Convergence Rates for Federated Policy Gradient under Heterogeneous Environment

May 29, 2025
Via arXiv

Finite-Sample Convergence Bounds for Trust Region Policy Optimization in Mean-Field Games

May 28, 2025
Via arXiv

Accelerating Nash Learning from Human Feedback via Mirror Prox

May 26, 2025
Via arXiv

Revisiting Non-Acyclic GFlowNets in Discrete Environments

Feb 11, 2025
Via arXiv

On Teacher Hacking in Language Model Distillation

Feb 04, 2025
Via arXiv

Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents

Oct 30, 2024
Via arXiv

Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization

Oct 20, 2024
Via arXiv

Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization

Jul 08, 2024
Via arXiv

Improving GFlowNets with Monte Carlo Tree Search

Jun 19, 2024
Via arXiv

Incentivized Learning in Principal-Agent Bandit Games

Mar 06, 2024
Via arXiv