Mateusz Ostaszewski

Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine
Oct 24, 2025

Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control
May 25, 2024

Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
Mar 01, 2024

A Case for Validation Buffer in Pessimistic Actor-Critic
Mar 01, 2024

Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
Feb 05, 2024

Curriculum reinforcement learning for quantum architecture search under hardware errors
Feb 05, 2024

On consequences of finetuning on data with highly discriminative features
Oct 30, 2023

Enhancing variational quantum state diagonalization using reinforcement learning techniques
Jun 22, 2023

The Tunnel Effect: Building Data Representations in Deep Neural Networks
May 31, 2023

Emergency action termination for immediate reaction in hierarchical reinforcement learning
Nov 11, 2022