Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness

Jun 16, 2023
Eric Zelikman, Qian Huang, Percy Liang, Nick Haber, Noah D. Goodman

Figures 1-3 for Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness
