
Harm de Vries


StarCoder 2 and The Stack v2: The Next Generation

Feb 29, 2024
Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, Evgenii Zheltonozhskii, Nii Osae Osae Dade, Wenhao Yu, Lucas Krauß, Naman Jain, Yixuan Su, Xuanli He, Manan Dey, Edoardo Abati, Yekun Chai, Niklas Muennighoff, Xiangru Tang, Muhtasham Oblokulov, Christopher Akiki, Marc Marone, Chenghao Mou, Mayank Mishra, Alex Gu, Binyuan Hui, Tri Dao, Armel Zebaze, Olivier Dehaene, Nicolas Patry, Canwen Xu, Julian McAuley, Han Hu, Torsten Scholak, Sebastien Paquet, Jennifer Robinson, Carolyn Jane Anderson, Nicolas Chapados, Mostofa Patwary, Nima Tajbakhsh, Yacine Jernite, Carlos Muñoz Ferrandis, Lingming Zhang, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries


Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models

Jan 01, 2024
Terry Yue Zhuo, Armel Zebaze, Nitchakarn Suppattarachai, Leandro von Werra, Harm de Vries, Qian Liu, Niklas Muennighoff


The BigCode Project Governance Card

Dec 06, 2023
BigCode collaboration, Sean Hughes, Harm de Vries, Jennifer Robinson, Carlos Muñoz Ferrandis, Loubna Ben Allal, Leandro von Werra, Jennifer Ding, Sebastien Paquet, Yacine Jernite


RepoFusion: Training Code Models to Understand Your Repository

Jun 19, 2023
Disha Shrivastava, Denis Kocetkov, Harm de Vries, Dzmitry Bahdanau, Torsten Scholak


StarCoder: may the source be with you!

May 09, 2023
Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries


The StatCan Dialogue Dataset: Retrieving Data Tables through Conversations with Genuine Intents

Apr 05, 2023
Xing Han Lu, Siva Reddy, Harm de Vries


SantaCoder: don't reach for the stars!

Jan 09, 2023
Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Muñoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy-Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, Ian Yu, Paulo Villegas, Marco Zocca, Sourab Mangrulkar, David Lansky, Huu Nguyen, Danish Contractor, Luis Villa, Jia Li, Dzmitry Bahdanau, Yacine Jernite, Sean Hughes, Daniel Fried, Arjun Guha, Harm de Vries, Leandro von Werra


The Stack: 3 TB of permissively licensed source code

Nov 20, 2022
Denis Kocetkov, Raymond Li, Loubna Ben Allal, Jia Li, Chenghao Mou, Carlos Muñoz Ferrandis, Yacine Jernite, Margaret Mitchell, Sean Hughes, Thomas Wolf, Dzmitry Bahdanau, Leandro von Werra, Harm de Vries


The Power of Prompt Tuning for Low-Resource Semantic Parsing

Oct 16, 2021
Nathan Schucher, Siva Reddy, Harm de Vries
