Shangmin Guo

Language Model Evolution: An Iterated Learning Perspective

Apr 04, 2024

Direct Language Model Alignment from Online AI Feedback

Feb 07, 2024

Decoding-time Realignment of Language Models

Feb 05, 2024

ICED: Zero-Shot Transfer in Reinforcement Learning via In-Context Environment Design

Feb 05, 2024

Sample Relationship from Learning Dynamics Matters for Generalisation

Jan 16, 2024

How the level sampling process impacts zero-shot generalisation in deep reinforcement learning

Oct 05, 2023

How to prepare your task head for finetuning

Feb 11, 2023

Deep Reinforcement Learning for Multi-Agent Interaction

Aug 02, 2022

Smoothing Matters: Momentum Transformer for Domain Adaptive Semantic Segmentation

Mar 15, 2022

Better Supervisory Signals by Observing Learning Paths

Mar 04, 2022