Thomas Wolf

Training Transformers Together

Jul 07, 2022
Alexander Borzunov, Max Ryabinin, Tim Dettmers, Quentin Lhoest, Lucile Saulnier, Michael Diskin, Yacine Jernite, Thomas Wolf

Multitask Prompted Training Enables Zero-Shot Task Generalization

Oct 15, 2021
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma, Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Stella Biderman, Leo Gao, Tali Bers, Thomas Wolf, Alexander M. Rush

Datasets: A Community Library for Natural Language Processing

Sep 07, 2021
Quentin Lhoest, Albert Villanova del Moral, Yacine Jernite, Abhishek Thakur, Patrick von Platen, Suraj Patil, Julien Chaumond, Mariama Drame, Julien Plu, Lewis Tunstall, Joe Davison, Mario Šaško, Gunjan Chhablani, Bhavitvya Malik, Simon Brandeis, Teven Le Scao, Victor Sanh, Canwen Xu, Nicolas Patry, Angelina McMillan-Major, Philipp Schmid, Sylvain Gugger, Clément Delangue, Théo Matussière, Lysandre Debut, Stas Bekman, Pierric Cistac, Thibault Goehringer, Victor Mustar, François Lagunas, Alexander M. Rush, Thomas Wolf

VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning

Jun 21, 2021
Hao Tan, Jie Lei, Thomas Wolf, Mohit Bansal

Distributed Deep Learning in Open Collaborations

Jun 18, 2021
Michael Diskin, Alexey Bukhtiyarov, Max Ryabinin, Lucile Saulnier, Quentin Lhoest, Anton Sinitsin, Dmitry Popov, Dmitry Pyrkin, Maxim Kashirin, Alexander Borzunov, Albert Villanova del Moral, Denis Mazur, Ilia Kobelev, Yacine Jernite, Thomas Wolf, Gennady Pekhimenko

Learning from others' mistakes: Avoiding dataset biases without modeling them

Dec 02, 2020
Victor Sanh, Thomas Wolf, Yonatan Belinkov, Alexander M. Rush

Movement Pruning: Adaptive Sparsity by Fine-Tuning

May 15, 2020
Victor Sanh, Thomas Wolf, Alexander M. Rush

TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation

Apr 09, 2020
Shaojie Jiang, Thomas Wolf, Christof Monz, Maarten de Rijke

HuggingFace's Transformers: State-of-the-art Natural Language Processing

Oct 16, 2019
Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Jamie Brew
