Jonathan Frankle

BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text

Mar 27, 2024
Elliot Bolton, Abhinav Venigalla, Michihiro Yasunaga, David Hall, Betty Xiong, Tony Lee, Roxana Daneshjou, Jonathan Frankle, Percy Liang, Michael Carbin, Christopher D. Manning

MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining

Jan 16, 2024
Jacob Portes, Alex Trott, Sam Havens, Daniel King, Abhinav Venigalla, Moin Nadeem, Nikhil Sardana, Daya Khudia, Jonathan Frankle

Dataset Difficulty and the Role of Inductive Bias

Jan 03, 2024
Devin Kwok, Nikhil Anand, Jonathan Frankle, Gintare Karolina Dziugaite, David Rolnick

Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws

Dec 31, 2023
Nikhil Sardana, Jonathan Frankle

CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images

Oct 25, 2023
Aaron Gokaslan, A. Feder Cooper, Jasmine Collins, Landan Seguin, Austin Jacobson, Mihir Patel, Jonathan Frankle, Cory Stephenson, Volodymyr Kuleshov

Dynamic Masking Rate Schedules for MLM Pretraining

May 24, 2023
Zachary Ankner, Naomi Saphra, Davis Blalock, Jonathan Frankle, Matthew L. Leavitt

Knowledge Distillation for Efficient Sequences of Training Runs

Mar 11, 2023
Xingyu Liu, Alex Leonardi, Lu Yu, Chris Gilmer-Hill, Matthew Leavitt, Jonathan Frankle

The Effect of Data Dimensionality on Neural Network Prunability

Dec 01, 2022
Zachary Ankner, Alex Renda, Gintare Karolina Dziugaite, Jonathan Frankle, Tian Jin

Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation

Nov 01, 2022
Cody Blakeney, Jessica Zosa Forde, Jonathan Frankle, Ziliang Zong, Matthew L. Leavitt
