Marc Marone

AdapterSwap: Continuous Training of LLMs with Data Removal and Access-Control Guarantees

Apr 12, 2024
William Fleshman, Aleem Khan, Marc Marone, Benjamin Van Durme

Via arXiv

Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data

Apr 05, 2024
Jingyu Zhang, Marc Marone, Tianjian Li, Benjamin Van Durme, Daniel Khashabi

Via arXiv

Dated Data: Tracing Knowledge Cutoffs in Large Language Models

Mar 19, 2024
Jeffrey Cheng, Marc Marone, Orion Weller, Dawn Lawrie, Daniel Khashabi, Benjamin Van Durme

Via arXiv

StarCoder 2 and The Stack v2: The Next Generation

Feb 29, 2024
Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, Evgenii Zheltonozhskii, Nii Osae Osae Dade, Wenhao Yu, Lucas Krauß, Naman Jain, Yixuan Su, Xuanli He, Manan Dey, Edoardo Abati, Yekun Chai, Niklas Muennighoff, Xiangru Tang, Muhtasham Oblokulov, Christopher Akiki, Marc Marone, Chenghao Mou, Mayank Mishra, Alex Gu, Binyuan Hui, Tri Dao, Armel Zebaze, Olivier Dehaene, Nicolas Patry, Canwen Xu, Julian McAuley, Han Hu, Torsten Scholak, Sebastien Paquet, Jennifer Robinson, Carolyn Jane Anderson, Nicolas Chapados, Mostofa Patwary, Nima Tajbakhsh, Yacine Jernite, Carlos Muñoz Ferrandis, Lingming Zhang, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

Via arXiv

"According to ..." Prompting Language Models Improves Quoting from Pre-Training Data

May 22, 2023
Orion Weller, Marc Marone, Nathaniel Weir, Dawn Lawrie, Daniel Khashabi, Benjamin Van Durme

Via arXiv

StarCoder: may the source be with you!

May 09, 2023
Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

Via arXiv

Data Portraits: Recording Foundation Model Training Data

Mar 06, 2023
Marc Marone, Benjamin Van Durme

Via arXiv

Pretrained Models for Multilingual Federated Learning

Jun 06, 2022
Orion Weller, Marc Marone, Vladimir Braverman, Dawn Lawrie, Benjamin Van Durme

Via arXiv

Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction

Sep 14, 2021
Mahsa Yarmohammadi, Shijie Wu, Marc Marone, Haoran Xu, Seth Ebner, Guanghui Qin, Yunmo Chen, Jialiang Guo, Craig Harman, Kenton Murray, Aaron Steven White, Mark Dredze, Benjamin Van Durme

Via arXiv

Character Eyes: Seeing Language through Character-Level Taggers

Mar 12, 2019
Yuval Pinter, Marc Marone, Jacob Eisenstein

Via arXiv