Chengming Zhang

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

Oct 11, 2023

Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors

Sep 29, 2023

DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models

Sep 25, 2023

PapagAI: Automated Feedback for Reflective Essays

Jul 10, 2023

HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs

May 03, 2023

HALOC: Hardware-Aware Automatic Low-Rank Compression for Compact Neural Networks

Jan 20, 2023

SOLAR: A Highly Optimized Data Loading Framework for Distributed Training of CNN-based Scientific Surrogates

Nov 04, 2022

H-GCN: A Graph Convolutional Network Accelerator on Versal ACAP Architecture

Jun 28, 2022

COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression

Nov 18, 2021

Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI

Jun 18, 2021