Jascha Sohl-Dickstein

Training LLMs over Neurally Compressed Text
Apr 04, 2024
Brian Lester, Jaehoon Lee, Alex Alemi, Jeffrey Pennington, Adam Roberts, Jascha Sohl-Dickstein, Noah Constant

The boundary of neural network trainability is fractal
Feb 09, 2024
Jascha Sohl-Dickstein

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Dec 22, 2023
Avi Singh, John D. Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J. Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron Parisi, Abhishek Kumar, Alex Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, Kathleen Kenealy, Kevin Swersky, Kshiteej Mahajan, Laura Culp, Lechao Xiao, Maxwell L. Bileschi, Noah Constant, Roman Novak, Rosanne Liu, Tris Warkentin, Yundi Qian, Yamini Bansal, Ethan Dyer, Behnam Neyshabur, Jascha Sohl-Dickstein, Noah Fiedel

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"
Nov 15, 2023
C. Daniel Freeman, Laura Culp, Aaron Parisi, Maxwell L Bileschi, Gamaleldin F Elsayed, Alex Rizkowsky, Isabelle Simpson, Alex Alemi, Azade Nova, Ben Adlam, Bernd Bohnet, Gaurav Mishra, Hanie Sedghi, Igor Mordatch, Izzeddin Gur, Jaehoon Lee, JD Co-Reyes, Jeffrey Pennington, Kelvin Xu, Kevin Swersky, Kshiteej Mahajan, Lechao Xiao, Rosanne Liu, Simon Kornblith, Noah Constant, Peter J. Liu, Roman Novak, Yundi Qian, Noah Fiedel, Jascha Sohl-Dickstein

Noise-Reuse in Online Evolution Strategies
Apr 21, 2023
Oscar Li, James Harrison, Jascha Sohl-Dickstein, Virginia Smith, Luke Metz

Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC
Feb 22, 2023
Yilun Du, Conor Durkan, Robin Strudel, Joshua B. Tenenbaum, Sander Dieleman, Rob Fergus, Jascha Sohl-Dickstein, Arnaud Doucet, Will Grathwohl

General-Purpose In-Context Learning by Meta-Learning Transformers
Dec 08, 2022
Louis Kirsch, James Harrison, Jascha Sohl-Dickstein, Luke Metz

VeLO: Training Versatile Learned Optimizers by Scaling Up
Nov 17, 2022
Luke Metz, James Harrison, C. Daniel Freeman, Amil Merchant, Lucas Beyer, James Bradbury, Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Sohl-Dickstein

A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases
Sep 22, 2022
James Harrison, Luke Metz, Jascha Sohl-Dickstein

Fast Finite Width Neural Tangent Kernel
Jun 17, 2022
Roman Novak, Jascha Sohl-Dickstein, Samuel S. Schoenholz
