Samuel Weinbach

Efficient Parallelization Layouts for Large-Scale Distributed Model Training

Nov 09, 2023
Johannes Hagemann, Samuel Weinbach, Konstantin Dobler, Maximilian Schall, Gerard de Melo

Tokenizer Choice For LLM Training: Negligible or Crucial?

Oct 18, 2023
Mehdi Ali, Michael Fromm, Klaudia Thellmann, Richard Rutmann, Max Lübbering, Johannes Leveling, Katrin Klug, Jan Ebert, Niclas Doll, Jasper Schulze Buschhoff, Charvi Jain, Alexander Arno Weber, Lena Jurkschat, Hammam Abdelwahab, Chelsea John, Pedro Ortiz Suarez, Malte Ostendorff, Samuel Weinbach, Rafet Sifa, Stefan Kesselheim, Nicolas Flores-Herr

MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation

May 24, 2023
Marco Bellagente, Manuel Brack, Hannah Teufel, Felix Friedrich, Björn Deiseroth, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Koen Oostermeijer, Andres Felipe Cruz-Salinas, Patrick Schramowski, Kristian Kersting, Samuel Weinbach

AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation

Jan 23, 2023
Mayukh Deb, Björn Deiseroth, Samuel Weinbach, Patrick Schramowski, Kristian Kersting

M-VADER: A Model for Diffusion with Multimodal Context

Dec 07, 2022
Samuel Weinbach, Marco Bellagente, Constantin Eichenberg, Andrew Dai, Robert Baldock, Souradeep Nanda, Björn Deiseroth, Koen Oostermeijer, Hannah Teufel, Andres Felipe Cruz-Salinas

GPT-NeoX-20B: An Open-Source Autoregressive Language Model

Apr 14, 2022
Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, USVSN Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, Samuel Weinbach

MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning

Dec 09, 2021
Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank

Domain-Level Explainability -- A Challenge for Creating Trust in Superhuman AI Strategies

Nov 12, 2020
Jonas Andrulis, Ole Meyer, Grégory Schott, Samuel Weinbach, Volker Gruhn
