Dale Schuurmans

University of Alberta

Soft Preference Optimization: Aligning Language Models to Expert Distributions

Apr 30, 2024
Via arXiv

Stochastic Gradient Succeeds for Bandits

Feb 27, 2024
Via arXiv

Video as the New Language for Real-World Decision Making

Feb 27, 2024
Via arXiv

Beyond Expectations: Learning with Stochastic Dominance Made Practical

Feb 05, 2024
Via arXiv

Curvature Explains Loss of Plasticity

Nov 30, 2023
Via arXiv

Provable Representation with Efficient Planning for Partially Observable Reinforcement Learning

Nov 20, 2023
Via arXiv

Large Language Models can Learn Rules

Oct 10, 2023
Via arXiv

Learning Interactive Real-World Simulators

Oct 09, 2023
Via arXiv

Probabilistic Adaptation of Text-to-Video Models

Jun 02, 2023
Via arXiv

Gradient-Free Structured Pruning with Unlabeled Data

Mar 07, 2023
Via arXiv