Pascal Poupart

University of Waterloo

A Critical Look At Tokenwise Reward-Guided Text Generation

Jun 12, 2024
Via arXiv

How Useful is Intermittent, Asynchronous Expert Feedback for Bayesian Optimization?

Jun 10, 2024
Via arXiv

A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization

Mar 20, 2024
Via arXiv

Why Online Reinforcement Learning is Causal

Mar 07, 2024
Via arXiv

A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?

Feb 07, 2024
Via arXiv

Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space

Dec 15, 2023
Via arXiv

Preventing Arbitrarily High Confidence on Far-Away Data in Point-Estimated Discriminative Neural Networks

Nov 07, 2023
Via arXiv

An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient

Aug 09, 2023
Via arXiv

Attribute Controlled Dialogue Prompting

Jul 11, 2023
Via arXiv

Continuation KD: Improved Knowledge Distillation through the Lens of Continuation Optimization

Dec 12, 2022
Via arXiv