Shih-Chieh Hsu

University of Washington

Data Science and Machine Learning in Education

Jul 19, 2022

Gabriele Benelli, Thomas Y. Chen, Javier Duarte, Matthew Feickert, Matthew Graham, Lindsey Gray, Dan Hackett, Phil Harris, Shih-Chieh Hsu, Gregor Kasieczka(+13 more)

Abstract:The growing role of data science (DS) and machine learning (ML) in high-energy physics (HEP) is well established and pertinent given the complex detectors, large data sets, and sophisticated analyses at the heart of HEP research. Moreover, exploiting symmetries inherent in physics data has inspired physics-informed ML as a vibrant sub-field of computer science research. HEP researchers benefit greatly from widely available materials for use in education, training and workforce development. They are also contributing to these materials and providing software to DS/ML-related fields. Increasingly, physics departments are offering courses at the intersection of DS, ML and physics, often using curricula developed by HEP researchers and involving open software and data used in HEP. In this white paper, we explore synergies between HEP research and DS/ML education, discuss opportunities and challenges at this intersection, and propose community activities that will be mutually beneficial.

* Contribution to Snowmass 2021

Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml

Jul 01, 2022

Elham E Khoda, Dylan Rankin, Rafael Teixeira de Lima, Philip Harris, Scott Hauck, Shih-Chieh Hsu, Michael Kagan, Vladimir Loncar, Chaitanya Paikara, Richa Rao(+3 more)

Abstract:Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers -- long short-term memory and gated recurrent unit -- within the hls4ml framework. We demonstrate that our implementation is capable of producing effective designs for both small and large models, and can be customized to meet specific design requirements for inference latencies and FPGA resources. We show the performance and synthesized designs for multiple neural networks, many of which are trained specifically for jet identification tasks at the CERN Large Hadron Collider.

* 12 pages, 6 figures, 5 tables

Open-source FPGA-ML codesign for the MLPerf Tiny Benchmark

Jun 23, 2022

Hendrik Borras, Giuseppe Di Guglielmo, Javier Duarte, Nicolò Ghielmetti, Ben Hawks, Scott Hauck, Shih-Chieh Hsu, Ryan Kastner, Jason Liang, Andres Meza(+8 more)

Abstract:We present our development experience and recent results for the MLPerf Tiny Inference Benchmark on field-programmable gate array (FPGA) platforms. We use the open-source hls4ml and FINN workflows, which aim to democratize AI-hardware codesign of optimized neural networks on FPGAs. We present the design and implementation process for the keyword spotting, anomaly detection, and image classification benchmark tasks. The resulting hardware implementations are quantized, configurable, spatial dataflow architectures tailored for speed and efficiency and introduce new generic optimizations and common workflows developed as a part of this work. The full workflow is presented from quantization-aware training to FPGA implementation. The solutions are deployed on system-on-chip (Pynq-Z2) and pure FPGA (Arty A7-100T) platforms. The resulting submissions achieve latencies as low as 20 μs and energy consumption as low as 30 μJ per inference. We demonstrate how emerging ML benchmarks on heterogeneous hardware platforms can catalyze collaboration and the development of new techniques and more accessible tools.

* 15 pages, 7 figures, Contribution to 3rd Workshop on Benchmarking Machine Learning Workloads on Emerging Hardware (MLBench) at 5th Conference on Machine Learning and Systems (MLSys)

QONNX: Representing Arbitrary-Precision Quantized Neural Networks

Jun 17, 2022

Alessandro Pappalardo, Yaman Umuroglu, Michaela Blott, Jovan Mitrevski, Ben Hawks, Nhan Tran, Vladimir Loncar, Sioni Summers, Hendrik Borras, Jules Muhizi(+4 more)

Abstract:We present extensions to the Open Neural Network Exchange (ONNX) intermediate representation format to represent arbitrary-precision quantized neural networks. We first introduce support for low precision quantization in existing ONNX-based quantization formats by leveraging integer clipping, resulting in two new backward-compatible variants: the quantized operator format with clipping and quantize-clip-dequantize (QCDQ) format. We then introduce a novel higher-level ONNX format called quantized ONNX (QONNX) that introduces three new operators -- Quant, BipolarQuant, and Trunc -- in order to represent uniform quantization. By keeping the QONNX IR high-level and flexible, we enable targeting a wider variety of platforms. We also present utilities for working with QONNX, as well as examples of its usage in the FINN and hls4ml toolchains. Finally, we introduce the QONNX model zoo to share low-precision quantized neural networks.

* 9 pages, 5 figures, Contribution to 4th Workshop on Accelerated Machine Learning (AccML) at HiPEAC 2022 Conference
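
The quantize-clip-dequantize idea named in the abstract above can be illustrated with a minimal sketch of uniform quantization with integer clipping. This is not the QONNX operator definition or any ONNX API: the function name `quantize_clip_dequantize` and its parameters are hypothetical, and Python's built-in `round` (round-half-to-even) is an assumed rounding mode.

```python
def quantize_clip_dequantize(x, scale, zero_point, bit_width, signed=True):
    """Illustrative uniform quantizer (hypothetical helper, not a QONNX API).

    Maps a float onto an integer grid, clips it to the range representable
    with `bit_width` bits, and maps it back to a float.
    """
    # Integer range for the requested bit width.
    if signed:
        q_min, q_max = -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
    else:
        q_min, q_max = 0, 2 ** bit_width - 1

    # Quantize: scale to the integer grid (assumed rounding: half-to-even).
    q = round(x / scale) + zero_point
    # Clip: this step is what restricts the value to a low bit width.
    q = max(q_min, min(q_max, q))
    # Dequantize: return to the floating-point domain.
    return (q - zero_point) * scale


if __name__ == "__main__":
    # Signed 4-bit grid with step 0.25 covers [-2.0, 1.75].
    print(quantize_clip_dequantize(0.3, 0.25, 0, 4))
    print(quantize_clip_dequantize(10.0, 0.25, 0, 4))
```

In this sketch, lowering `bit_width` only narrows the clipping range; the step size between representable values is set by `scale`.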

Exploring the Universality of Hadronic Jet Classification

Apr 08, 2022

Kingman Cheung, Yi-Lun Chung, Shih-Chieh Hsu, Benjamin Nachman

Abstract:The modeling of jet substructure significantly differs between Parton Shower Monte Carlo (PSMC) programs. Despite this, we observe that machine learning classifiers trained on different PSMCs learn nearly the same function. This means that when these classifiers are applied to the same PSMC for testing, they result in nearly the same performance. This classifier universality indicates that a machine learning model trained on one simulation and tested on another simulation (or data) will likely be optimal. Our observations are based on detailed studies of shallow and deep neural networks applied to simulated Lorentz boosted Higgs jet tagging at the LHC.

* 25 pages, 7 figures, 7 tables

Physics Community Needs, Tools, and Resources for Machine Learning

Mar 30, 2022

Philip Harris, Erik Katsavounidis, William Patrick McCormack, Dylan Rankin, Yongbin Feng, Abhijith Gandrakota, Christian Herwig, Burt Holzman, Kevin Pedro, Nhan Tran(+11 more)

Abstract:Machine learning (ML) is becoming an increasingly important component of cutting-edge physics research, but its computational requirements present significant challenges. In this white paper, we discuss the needs of the physics community regarding ML across latency and throughput regimes, the tools and resources that offer the possibility of addressing these needs, and how these can be best utilized and accessed in the coming years.

* Contribution to Snowmass 2021, 33 pages, 5 figures

Graph Neural Networks for Charged Particle Tracking on FPGAs

Dec 03, 2021

Abdelrahman Elabd, Vesal Razavimaleki, Shi-Yu Huang, Javier Duarte, Markus Atkinson, Gage DeZoort, Peter Elmer, Jin-Xuan Hu, Shih-Chieh Hsu, Bo-Cheng Lai(+3 more)

Abstract:The determination of charged particle trajectories in collisions at the CERN Large Hadron Collider (LHC) is an important but challenging problem, especially in the high interaction density conditions expected during the future high-luminosity phase of the LHC (HL-LHC). Graph neural networks (GNNs) are a type of geometric deep learning algorithm that has successfully been applied to this task by embedding tracker data as a graph -- nodes represent hits, while edges represent possible track segments -- and classifying the edges as true or fake track segments. However, their study in hardware- or software-based trigger applications has been limited due to their large computational cost. In this paper, we introduce an automated translation workflow, integrated into a broader tool called hls4ml, for converting GNNs into firmware for field-programmable gate arrays (FPGAs). We use this translation tool to implement GNNs for charged particle tracking, trained using the TrackML challenge dataset, on FPGAs with designs targeting different graph sizes, task complexities, and latency/throughput requirements. This work could enable the inclusion of charged particle tracking GNNs at the trigger level for HL-LHC experiments.

* 26 pages, 17 figures, 1 table
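
The graph embedding described in the abstract above (hits as nodes, candidate track segments as edges, edges classified as true or fake) can be sketched in a few lines. Everything here is an assumption for illustration: the `(layer, phi)` hit format, the adjacent-layer and `max_dphi` selection, the function names, and the hand-written scores, which stand in for the output of a trained GNN. It is not the paper's pipeline or the hls4ml API.

```python
from itertools import product


def build_candidate_edges(hits, max_dphi):
    """Connect hits on adjacent layers whose azimuthal angles are close.

    `hits` is a list of (layer, phi) tuples (hypothetical format); the
    returned edges are (source_index, target_index) pairs.
    """
    edges = []
    for (i, (layer_i, phi_i)), (j, (layer_j, phi_j)) in product(
        enumerate(hits), repeat=2
    ):
        # Keep only outward segments between neighbouring layers.
        if layer_j == layer_i + 1 and abs(phi_j - phi_i) < max_dphi:
            edges.append((i, j))
    return edges


def classify_edges(edges, scores, threshold=0.5):
    """Keep edges whose score passes the threshold (true track segments)."""
    return [edge for edge, score in zip(edges, scores) if score >= threshold]


if __name__ == "__main__":
    hits = [(0, 0.10), (0, 1.00), (1, 0.12), (1, 1.05), (2, 0.15)]
    edges = build_candidate_edges(hits, max_dphi=0.1)
    # Placeholder scores; a trained edge classifier would supply these.
    print(classify_edges(edges, scores=[0.9, 0.2, 0.7]))
```

The number of candidate edges grows quickly with the number of hits, which is one reason graph size appears as a design parameter in the abstract.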

Applications and Techniques for Fast Machine Learning in Science

Oct 25, 2021

Allison McCarn Deiana, Nhan Tran, Joshua Agar, Michaela Blott, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Scott Hauck, Mia Liu, Mark S. Neubauer(+77 more)

Abstract:In this community review report, we discuss applications and techniques for fast machine learning (ML) in science -- the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.

* 66 pages, 13 figures, 5 tables

SPANet: Generalized Permutationless Set Assignment for Particle Physics using Symmetry Preserving Attention

Jun 07, 2021

Alexander Shmakov, Michael James Fenton, Ta-Wei Ho, Shih-Chieh Hsu, Daniel Whiteson, Pierre Baldi

Abstract:The creation of unstable heavy particles at the Large Hadron Collider is the most direct way to address some of the deepest open questions in physics. Collisions typically produce variable-size sets of observed particles which have inherent ambiguities complicating the assignment of observed particles to the decay products of the heavy particles. Current strategies for tackling these challenges in the physics community ignore the physical symmetries of the decay products, consider all possible assignment permutations, and do not scale to complex configurations. Attention-based deep learning methods for sequence modelling have achieved state-of-the-art performance in natural language processing, but they lack built-in mechanisms to deal with the unique symmetries found in physical set-assignment problems. We introduce a novel method for constructing symmetry-preserving attention networks which reflect the problem's natural invariances to efficiently find assignments without evaluating all permutations. This general approach is applicable to arbitrarily complex configurations and significantly outperforms current methods, improving reconstruction efficiency by between 19% and 35% on typical benchmark problems while decreasing inference time by two to five orders of magnitude on the most complex events, making many important and previously intractable cases tractable. A full code repository containing a general library, the specific configuration used, and a complete dataset release is available at https://github.com/Alexanders101/SPANet

* submitted to NeurIPS 2021

hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices

Mar 23, 2021

Farah Fahim, Benjamin Hawks, Christian Herwig, James Hirschauer, Sergo Jindariani, Nhan Tran, Luca P. Carloni, Giuseppe Di Guglielmo, Philip Harris, Jeffrey Krupa(+20 more)

Abstract:Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends, including an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery.

* 10 pages, 8 figures, TinyML Research Symposium 2021
