Shangmin Guo

Language Model Evolution: An Iterated Learning Perspective

Apr 04, 2024
Yi Ren, Shangmin Guo, Linlu Qiu, Bailin Wang, Danica J. Sutherland

Via arXiv

Direct Language Model Alignment from Online AI Feedback

Feb 07, 2024
Shangmin Guo, Biao Zhang, Tianlin Liu, Tianqi Liu, Misha Khalman, Felipe Llinares, Alexandre Rame, Thomas Mesnard, Yao Zhao, Bilal Piot, Johan Ferret, Mathieu Blondel

Via arXiv

ICED: Zero-Shot Transfer in Reinforcement Learning via In-Context Environment Design

Feb 05, 2024
Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht

Via arXiv

Decoding-time Realignment of Language Models

Feb 05, 2024
Tianlin Liu, Shangmin Guo, Leonardo Bianco, Daniele Calandriello, Quentin Berthet, Felipe Llinares, Jessica Hoffmann, Lucas Dixon, Michal Valko, Mathieu Blondel

Via arXiv

Sample Relationship from Learning Dynamics Matters for Generalisation

Jan 16, 2024
Shangmin Guo, Yi Ren, Stefano V. Albrecht, Kenny Smith

Via arXiv

How the level sampling process impacts zero-shot generalisation in deep reinforcement learning

Oct 05, 2023
Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht

Via arXiv

How to prepare your task head for finetuning

Feb 11, 2023
Yi Ren, Shangmin Guo, Wonho Bae, Danica J. Sutherland

Via arXiv

Deep Reinforcement Learning for Multi-Agent Interaction

Aug 02, 2022
Ibrahim H. Ahmed, Cillian Brewitt, Ignacio Carlucho, Filippos Christianos, Mhairi Dunion, Elliot Fosong, Samuel Garcin, Shangmin Guo, Balint Gyevnar, Trevor McInroe, Georgios Papoudakis, Arrasy Rahman, Lukas Schäfer, Massimiliano Tamborski, Giuseppe Vecchio, Cheng Wang, Stefano V. Albrecht

Via arXiv

Smoothing Matters: Momentum Transformer for Domain Adaptive Semantic Segmentation

Mar 15, 2022
Runfa Chen, Yu Rong, Shangmin Guo, Jiaqi Han, Fuchun Sun, Tingyang Xu, Wenbing Huang

Via arXiv

Better Supervisory Signals by Observing Learning Paths

Mar 04, 2022
Yi Ren, Shangmin Guo, Danica J. Sutherland

Via arXiv