Natasha Jaques

AgenticRed: Optimizing Agentic Systems for Automated Red-teaming

Jan 20, 2026

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Nov 10, 2025

Evaluating & Reducing Deceptive Dialogue From Language Models with Multi-turn RL

Oct 16, 2025

Generative Modeling for Robust Deep Reinforcement Learning on the Traveling Salesman Problem

Aug 12, 2025

Learning Pluralistic User Preferences through Reinforcement Learning Fine-tuned Summaries

Jul 17, 2025

Adaptive Accompaniment with ReaLchords

Jun 17, 2025

Chasing Moving Targets with Online Self-Play Reinforcement Learning for Safer Language Models

Jun 09, 2025

Improving Human-AI Coordination through Adversarial Training and Generative Models

Apr 21, 2025

Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination

Apr 20, 2025

Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward

Apr 04, 2025