Yee Whye Teh

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

Apr 11, 2024
Aleksandar Botev, Soham De, Samuel L Smith, Anushan Fernando, George-Cristian Muraru, Ruba Haroun, Leonard Berrada, Razvan Pascanu, Pier Giuseppe Sessa, Robert Dadashi, Léonard Hussenot, Johan Ferret, Sertan Girgin, Olivier Bachem, Alek Andreev, Kathleen Kenealy, Thomas Mesnard, Cassidy Hardin, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Armand Joulin, Noah Fiedel, Evan Senter, Yutian Chen, Srivatsan Srinivasan, Guillaume Desjardins, David Budden, Arnaud Doucet, Sharad Vikram, Adam Paszke, Trevor Gale, Sebastian Borgeaud, Charlie Chen, Andy Brock, Antonia Paterson, Jenny Brennan, Meg Risdal, Raj Gundluru, Nesh Devanathan, Paul Mooney, Nilay Chauhan, Phil Culliton, Luiz Gustavo Martins, Elisa Bandy, David Huntsperger, Glenn Cameron, Arthur Zucker, Tris Warkentin, Ludovic Peran, Minh Giang, Zoubin Ghahramani, Clément Farabet, Koray Kavukcuoglu, Demis Hassabis, Raia Hadsell, Yee Whye Teh, Nando de Freitas

Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts

Mar 13, 2024
Shengzhuang Chen, Jihoon Tack, Yunqiao Yang, Yee Whye Teh, Jonathan Richard Schwarz, Ying Wei

Online Adaptation of Language Models with a Memory of Amortized Contexts

Mar 07, 2024
Jihoon Tack, Jaehyung Kim, Eric Mitchell, Jinwoo Shin, Yee Whye Teh, Jonathan Richard Schwarz

Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models

Mar 03, 2024
Amal Rannen-Triki, Jorg Bornschein, Razvan Pascanu, Marcus Hutter, Andras György, Alexandre Galashov, Yee Whye Teh, Michalis K. Titsias

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Feb 29, 2024
Soham De, Samuel L. Smith, Anushan Fernando, Aleksandar Botev, George Cristian-Muraru, Albert Gu, Ruba Haroun, Leonard Berrada, Yutian Chen, Srivatsan Srinivasan, Guillaume Desjardins, Arnaud Doucet, David Budden, Yee Whye Teh, Razvan Pascanu, Nando De Freitas, Caglar Gulcehre

The Edge-of-Reach Problem in Offline Model-Based Reinforcement Learning

Feb 19, 2024
Anya Sims, Cong Lu, Yee Whye Teh

Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI

Feb 06, 2024
Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, Jose Miguel Hernandez Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang

Continual Learning via Sequential Function-Space Variational Inference

Dec 28, 2023
Tim G. J. Rudner, Freddie Bickford Smith, Qixuan Feng, Yee Whye Teh, Yarin Gal

Tractable Function-Space Variational Inference in Bayesian Neural Networks

Dec 28, 2023
Tim G. J. Rudner, Zonghao Chen, Yee Whye Teh, Yarin Gal