Hari Subramoni

The Case for Co-Designing Model Architectures with Hardware

Jan 30, 2024
Quentin Anthony, Jacob Hatef, Deepak Narayanan, Stella Biderman, Stas Bekman, Junqi Yin, Aamir Shafi, Hari Subramoni, Dhabaleswar Panda

Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference

Jan 17, 2024
Jinghan Yao, Quentin Anthony, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda

Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference

May 24, 2023
Jinghan Yao, Nawras Alnaasan, Tian Chen, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda

MCR-DL: Mix-and-Match Communication Runtime for Deep Learning

Mar 15, 2023
Quentin Anthony, Ammar Ahmad Awan, Jeff Rasley, Yuxiong He, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar Panda

Performance Characterization of using Quantization for DNN Inference on Edge Devices: Extended Version

Mar 09, 2023
Hyunho Ahn, Tian Chen, Nawras Alnaasan, Aamir Shafi, Mustafa Abduljabbar, Hari Subramoni, Dhabaleswar K. Panda

OMB-Py: Python Micro-Benchmarks for Evaluating Performance of MPI Libraries on HPC Systems

Oct 20, 2021
Nawras Alnaasan, Arpan Jain, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda

Efficient MPI-based Communication for GPU-Accelerated Dask Applications

Jan 21, 2021
Aamir Shafi, Jahanzeb Maqbool Hashmi, Hari Subramoni, Dhabaleswar K. Panda

HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training using TensorFlow

Nov 12, 2019
Ammar Ahmad Awan, Arpan Jain, Quentin Anthony, Hari Subramoni, Dhabaleswar K. Panda
