Richard Vencu

DataComp: In search of the next generation of multimodal datasets

May 03, 2023
Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander Ratner, Shuran Song, Hannaneh Hajishirzi, Ali Farhadi, Romain Beaumont, Sewoong Oh, Alex Dimakis, Jenia Jitsev, Yair Carmon, Vaishaal Shankar, Ludwig Schmidt

Via arXiv

LAION-5B: An open large-scale dataset for training next generation image-text models

Oct 16, 2022
Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, Patrick Schramowski, Srivatsa Kundurthy, Katherine Crowson, Ludwig Schmidt, Robert Kaczmarczyk, Jenia Jitsev

Via arXiv

LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs

Nov 03, 2021
Christoph Schuhmann, Richard Vencu, Romain Beaumont, Robert Kaczmarczyk, Clayton Mullis, Aarush Katta, Theo Coombes, Jenia Jitsev, Aran Komatsuzaki

Via arXiv