Joel Hestness

MediSwift: Efficient Sparse Pre-trained Biomedical Language Models

Mar 01, 2024
Vithursan Thangarasa, Mahmoud Salem, Shreyas Saxena, Kevin Leong, Joel Hestness, Sean Lie

Position Interpolation Improves ALiBi Extrapolation

Oct 18, 2023
Faisal Al-Khateeb, Nolan Dey, Daria Soboleva, Joel Hestness

BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model

Sep 20, 2023
Nolan Dey, Daria Soboleva, Faisal Al-Khateeb, Bowen Yang, Ribhu Pathria, Hemant Khachane, Shaheer Muhammad, Zhiming Chen, Robert Myers, Jacob Robert Steeves, Natalia Vassilieva, Marvin Tom, Joel Hestness

SlimPajama-DC: Understanding Data Combinations for LLM Training

Sep 19, 2023
Zhiqiang Shen, Tianhua Tao, Liqun Ma, Willie Neiswanger, Joel Hestness, Natalia Vassilieva, Daria Soboleva, Eric Xing

Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster

Apr 06, 2023
Nolan Dey, Gurpreet Gosal, Zhiming Chen, Hemant Khachane, William Marshall, Ribhu Pathria, Marvin Tom, Joel Hestness

RevBiFPN: The Fully Reversible Bidirectional Feature Pyramid Network

Jun 28, 2022
Vitaliy Chiley, Vithursan Thangarasa, Abhay Gupta, Anshul Samar, Joel Hestness, Dennis DeCoste

Time Dependency, Data Flow, and Competitive Advantage

Mar 17, 2022
Ehsan Valavi, Joel Hestness, Marco Iansiti, Newsha Ardalani, Feng Zhu, Karim R. Lakhani

Time and the Value of Data

Mar 17, 2022
Ehsan Valavi, Joel Hestness, Newsha Ardalani, Marco Iansiti

Efficiently Disentangle Causal Representations

Jan 06, 2022
Yuanpeng Li, Joel Hestness, Mohamed Elhoseiny, Liang Zhao, Kenneth Church

Memory Efficient 3D U-Net with Reversible Mobile Inverted Bottlenecks for Brain Tumor Segmentation

Apr 21, 2021
Mihir Pendse, Vithursan Thangarasa, Vitaliy Chiley, Ryan Holmdahl, Joel Hestness, Dennis DeCoste
