Gennady Pekhimenko

Accelerating Graph Neural Networks on Real Processing-In-Memory Systems

Feb 26, 2024
Via arXiv

The Synergy of Speculative Decoding and Batching in Serving Large Language Models

Oct 28, 2023
Via arXiv

Speeding up Fourier Neural Operators via Mixed Precision

Jul 27, 2023
Via arXiv

Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction

Oct 19, 2022
Via arXiv

Hidet: Task Mapping Programming Paradigm for Deep Learning Tensor Programs

Oct 18, 2022
Via arXiv

Optimizing Data Collection in Deep Reinforcement Learning

Jul 15, 2022
Via arXiv

MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation

Oct 08, 2021
Via arXiv

Distributed Deep Learning in Open Collaborations

Jun 18, 2021
Via arXiv

Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices

Mar 04, 2021
Via arXiv

RL-Scope: Cross-Stack Profiling for Deep Reinforcement Learning Workloads

Mar 04, 2021
Via arXiv