Juan Ciro

Speech Wikimedia: A 77 Language Multilingual Speech Dataset

Aug 30, 2023
Rafael Mosquera Gómez, Julián Eusse, Juan Ciro, Daniel Galvez, Ryan Hileman, Kurt Bollacker, David Kanter

Via arXiv

Adversarial Nibbler: A Data-Centric Challenge for Improving the Safety of Text-to-Image Models

May 22, 2023
Alicia Parrish, Hannah Rose Kirk, Jessica Quaye, Charvi Rastogi, Max Bartolo, Oana Inel, Juan Ciro, Rafael Mosquera, Addison Howard, Will Cukierski, D. Sculley, Vijay Janapa Reddi, Lora Aroyo

Via arXiv

DataPerf: Benchmarks for Data-Centric AI Development

Jul 20, 2022
Mark Mazumder, Colby Banbury, Xiaozhe Yao, Bojan Karlaš, William Gaviria Rojas, Sudnya Diamos, Greg Diamos, Lynn He, Douwe Kiela, David Jurado, David Kanter, Rafael Mosquera, Juan Ciro, Lora Aroyo, Bilge Acun, Sabri Eyuboglu, Amirata Ghorbani, Emmett Goodman, Tariq Kane, Christine R. Kirkpatrick, Tzu-Sheng Kuo, Jonas Mueller, Tristan Thrush, Joaquin Vanschoren, Margaret Warren, Adina Williams, Serena Yeung, Newsha Ardalani, Praveen Paritosh, Ce Zhang, James Zou, Carole-Jean Wu, Cody Coleman, Andrew Ng, Peter Mattson, Vijay Janapa Reddi

Via arXiv

LSH methods for data deduplication in a Wikipedia artificial dataset

Dec 10, 2021
Juan Ciro, Daniel Galvez, Tim Schlippe, David Kanter

Via arXiv

The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage

Nov 17, 2021
Daniel Galvez, Greg Diamos, Juan Ciro, Juan Felipe Cerón, Keith Achorn, Anjali Gopi, David Kanter, Maximilian Lam, Mark Mazumder, Vijay Janapa Reddi

Via arXiv