Satoshi Matsuoka

Myths and Legends in High-Performance Computing

Jan 06, 2023
Satoshi Matsuoka, Jens Domke, Mohamed Wahib, Aleksandr Drozd, Torsten Hoefler

MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems

Oct 26, 2021
Steven Farrell, Murali Emani, Jacob Balma, Lukas Drescher, Aleksandr Drozd, Andreas Fink, Geoffrey Fox, David Kanter, Thorsten Kurth, Peter Mattson, Dawei Mu, Amit Ruhela, Kento Sato, Koichi Shirahata, Tsuguchika Tabaru, Aristeidis Tsaris, Jan Balewski, Ben Cumming, Takumi Danjo, Jens Domke, Takaaki Fukai, Naoto Fukumoto, Tatsuya Fukushi, Balazs Gerofi, Takumi Honda, Toshiyuki Imamura, Akihiko Kasagi, Kentaro Kawakami, Shuhei Kudo, Akiyoshi Kuroda, Maxime Martinasso, Satoshi Matsuoka, Henrique Mendonça, Kazuki Minami, Prabhat Ram, Takashi Sawada, Mallikarjun Shankar, Tom St. John, Akihiro Tabuchi, Venkatram Vishwanath, Mohamed Wahib, Masafumi Yamazaki, Junqi Yin

Scaling Distributed Deep Learning Workloads beyond the Memory Capacity with KARMA

Aug 26, 2020
Mohamed Wahib, Haoyu Zhang, Truong Thao Nguyen, Aleksandr Drozd, Jens Domke, Lingqi Zhang, Ryousei Takano, Satoshi Matsuoka

The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism

Jul 25, 2020
Yosuke Oyama, Naoya Maruyama, Nikoli Dryden, Erin McCarthy, Peter Harrington, Jan Balewski, Satoshi Matsuoka, Peter Nugent, Brian Van Essen

Second-order Optimization Method for Large Mini-batch: Training ResNet-50 on ImageNet in 35 Epochs

Dec 05, 2018
Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Rio Yokota, Satoshi Matsuoka

μ-cuDNN: Accelerating Deep Learning Frameworks with Micro-Batching

Apr 13, 2018
Yosuke Oyama, Tal Ben-Nun, Torsten Hoefler, Satoshi Matsuoka
