Alexander Golubev

SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale

Feb 27, 2026

LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding

Feb 27, 2026

Blockwise Advantage Estimation for Multi-Objective RL with Verifiable Rewards

Feb 10, 2026

SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

May 26, 2025

Guided Search Strategies in Non-Serializable Environments with Applications to Software Engineering Agents

May 19, 2025

Variance Reduction for Policy-Gradient Methods via Empirical Variance Minimization

Jun 15, 2022