Dingwen Tao

Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors

Sep 29, 2023
Chengming Zhang, Baixi Sun, Xiaodong Yu, Zhen Xie, Weijian Zheng, Kamil Iskra, Pete Beckman, Dingwen Tao

HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs

May 03, 2023
Chengming Zhang, Shaden Smith, Baixi Sun, Jiannan Tian, Jonathan Soifer, Xiaodong Yu, Shuaiwen Leon Song, Yuxiong He, Dingwen Tao

HALOC: Hardware-Aware Automatic Low-Rank Compression for Compact Neural Networks

Jan 20, 2023
Jinqi Xiao, Chengming Zhang, Yu Gong, Miao Yin, Yang Sui, Lizhi Xiang, Dingwen Tao, Bo Yuan

SOLAR: A Highly Optimized Data Loading Framework for Distributed Training of CNN-based Scientific Surrogates

Nov 04, 2022
Baixi Sun, Xiaodong Yu, Chengming Zhang, Jiannan Tian, Sian Jin, Kamil Iskra, Tao Zhou, Tekin Bicer, Pete Beckman, Dingwen Tao

H-GCN: A Graph Convolutional Network Accelerator on Versal ACAP Architecture

Jun 28, 2022
Chengming Zhang, Tong Geng, Anqi Guo, Jiannan Tian, Martin Herbordt, Ang Li, Dingwen Tao

COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression

Nov 18, 2021
Sian Jin, Chengming Zhang, Xintong Jiang, Yunhe Feng, Hui Guan, Guanpeng Li, Shuaiwen Leon Song, Dingwen Tao

Exploring Autoencoder-Based Error-Bounded Compression for Scientific Data

May 25, 2021
Jinyang Liu, Sheng Di, Kai Zhao, Sian Jin, Dingwen Tao, Xin Liang, Zizhong Chen, Franck Cappello

An Efficient End-to-End Deep Learning Training Framework via Fine-Grained Pattern-Based Pruning

Nov 20, 2020
Chengming Zhang, Geng Yuan, Wei Niu, Jiannan Tian, Sian Jin, Donglin Zhuang, Zhe Jiang, Yanzhi Wang, Bin Ren, Shuaiwen Leon Song, Dingwen Tao

A Novel Memory-Efficient Deep Learning Training Framework via Error-Bounded Lossy Compression

Nov 18, 2020
Sian Jin, Guanpeng Li, Shuaiwen Leon Song, Dingwen Tao
