Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Topic:Jet Tagging

JetFormer: A Scalable and Efficient Transformer for Jet Tagging from Offline Analysis to FPGA Triggers

Jan 23, 2026

Ruoqing Zheng, Chang Sun, Qibin Liu, Lauri Laatu, Arianna Cox, Benedikt Maier, Alexander Tapper, Jose G. F. Coutinho, Wayne Luk, Zhiqiang Que

Abstract:We present JetFormer, a versatile and scalable encoder-only Transformer architecture for particle jet tagging at the Large Hadron Collider (LHC). Unlike prior approaches that are often tailored to specific deployment regimes, JetFormer is designed to operate effectively across the full spectrum of jet tagging scenarios, from high-accuracy offline analysis to ultra-low-latency online triggering. The model processes variable-length sets of particle features without relying on input of explicit pairwise interactions, yet achieves competitive or superior performance compared to state-of-the-art methods. On the large-scale JetClass dataset, a large-scale JetFormer matches the accuracy of the interaction-rich ParT model (within 0.7%) while using 37.4% fewer FLOPs, demonstrating its computational efficiency and strong generalization. On benchmark HLS4ML 150P datasets, JetFormer consistently outperforms existing models such as MLPs, Deep Sets, and Interaction Networks by 3-4% in accuracy. To bridge the gap to hardware deployment, we further introduce a hardware-aware optimization pipeline based on multi-objective hyperparameter search, yielding compact variants like JetFormer-tiny suitable for FPGA-based trigger systems with sub-microsecond latency requirements. Through structured pruning and quantization, we show that JetFormer can be aggressively compressed with minimal accuracy loss. By unifying high-performance modeling and deployability within a single architectural framework, JetFormer provides a practical pathway for deploying Transformer-based jet taggers in both offline and online environments at the LHC. Code is available at https://github.com/walkieq/JetFormer.

* 15 pages,

Via

Access Paper or Ask Questions

Towards Tensor Network Models for Low-Latency Jet Tagging on FPGAs

Jan 15, 2026

Alberto Coppi, Ema Puljak, Lorenzo Borella, Daniel Jaschke, Enrique Rico, Maurizio Pierini, Jacopo Pazzini, Andrea Triossi, Simone Montangero

Abstract:We present a systematic study of Tensor Network (TN) models $\unicode{x2013}$ Matrix Product States (MPS) and Tree Tensor Networks (TTN) $\unicode{x2013}$ for real-time jet tagging in high-energy physics, with a focus on low-latency deployment on Field Programmable Gate Arrays (FPGAs). Motivated by the strict requirements of the HL-LHC Level-1 trigger system, we explore TNs as compact and interpretable alternatives to deep neural networks. Using low-level jet constituent features, our models achieve competitive performance compared to state-of-the-art deep learning classifiers. We investigate post-training quantization to enable hardware-efficient implementations without degrading classification performance or latency. The best-performing models are synthesized to estimate FPGA resource usage, latency, and memory occupancy, demonstrating sub-microsecond latency and supporting the feasibility of online deployment in real-time trigger systems. Overall, this study highlights the potential of TN-based models for fast and resource-efficient inference in low-latency environments.

* 10 pages, 8 figures

Via

Access Paper or Ask Questions

Vision Transformers for End-to-End Quark-Gluon Jet Classification from Calorimeter Images

Jun 17, 2025

Md Abrar Jahin, Shahriar Soudeep, Arian Rahman Aditta, M. F. Mridha, Nafiz Fahad, Md. Jakir Hossen

Abstract:Distinguishing between quark- and gluon-initiated jets is a critical and challenging task in high-energy physics, pivotal for improving new physics searches and precision measurements at the Large Hadron Collider. While deep learning, particularly Convolutional Neural Networks (CNNs), has advanced jet tagging using image-based representations, the potential of Vision Transformer (ViT) architectures, renowned for modeling global contextual information, remains largely underexplored for direct calorimeter image analysis, especially under realistic detector and pileup conditions. This paper presents a systematic evaluation of ViTs and ViT-CNN hybrid models for quark-gluon jet classification using simulated 2012 CMS Open Data. We construct multi-channel jet-view images from detector-level energy deposits (ECAL, HCAL) and reconstructed tracks, enabling an end-to-end learning approach. Our comprehensive benchmarking demonstrates that ViT-based models, notably ViT+MaxViT and ViT+ConvNeXt hybrids, consistently outperform established CNN baselines in F1-score, ROC-AUC, and accuracy, highlighting the advantage of capturing long-range spatial correlations within jet substructure. This work establishes the first systematic framework and robust performance baselines for applying ViT architectures to calorimeter image-based jet classification using public collider data, alongside a structured dataset suitable for further deep learning research in this domain.

* Accepted in Third International Workshop on Generalizing from Limited Resources in the Open World Workshop at International Joint Conference on Artificial Intelligence (IJCAI) 2025

Via

Access Paper or Ask Questions

Guided Graph Compression for Quantum Graph Neural Networks

Jun 11, 2025

Mikel Casals, Vasilis Belis, Elias F. Combarro, Eduard Alarcón, Sofia Vallecorsa, Michele Grossi

Figure 1 for Guided Graph Compression for Quantum Graph Neural Networks

Figure 2 for Guided Graph Compression for Quantum Graph Neural Networks

Figure 3 for Guided Graph Compression for Quantum Graph Neural Networks

Figure 4 for Guided Graph Compression for Quantum Graph Neural Networks

Abstract:Graph Neural Networks (GNNs) are effective for processing graph-structured data but face challenges with large graphs due to high memory requirements and inefficient sparse matrix operations on GPUs. Quantum Computing (QC) offers a promising avenue to address these issues and inspires new algorithmic approaches. In particular, Quantum Graph Neural Networks (QGNNs) have been explored in recent literature. However, current quantum hardware limits the dimension of the data that can be effectively encoded. Existing approaches either simplify datasets manually or use artificial graph datasets. This work introduces the Guided Graph Compression (GGC) framework, which uses a graph autoencoder to reduce both the number of nodes and the dimensionality of node features. The compression is guided to enhance the performance of a downstream classification task, which can be applied either with a quantum or a classical classifier. The framework is evaluated on the Jet Tagging task, a classification problem of fundamental importance in high energy physics that involves distinguishing particle jets initiated by quarks from those by gluons. The GGC is compared against using the autoencoder as a standalone preprocessing step and against a baseline classical GNN classifier. Our numerical results demonstrate that GGC outperforms both alternatives, while also facilitating the testing of novel QGNN ansatzes on realistic datasets.

Via

Access Paper or Ask Questions

Fast Jet Tagging with MLP-Mixers on FPGAs

Mar 05, 2025

Chang Sun, Jennifer Ngadiuba, Maurizio Pierini, Maria Spiropulu

Figure 1 for Fast Jet Tagging with MLP-Mixers on FPGAs

Figure 2 for Fast Jet Tagging with MLP-Mixers on FPGAs

Figure 3 for Fast Jet Tagging with MLP-Mixers on FPGAs

Figure 4 for Fast Jet Tagging with MLP-Mixers on FPGAs

Abstract:We explore the innovative use of MLP-Mixer models for real-time jet tagging and establish their feasibility on resource-constrained hardware like FPGAs. MLP-Mixers excel in processing sequences of jet constituents, achieving state-of-the-art performance on datasets mimicking Large Hadron Collider conditions. By using advanced optimization techniques such as High-Granularity Quantization and Distributed Arithmetic, we achieve unprecedented efficiency. These models match or surpass the accuracy of previous architectures, reduce hardware resource usage by up to 97%, double the throughput, and half the latency. Additionally, non-permutation-invariant architectures enable smart feature prioritization and efficient FPGA deployment, setting a new benchmark for machine learning in real-time data processing at particle colliders.

Via

Access Paper or Ask Questions

Interpreting Transformers for Jet Tagging

Dec 04, 2024

Aaron Wang, Abijith Gandrakota, Jennifer Ngadiuba, Vivekanand Sahu, Priyansh Bhatnagar, Elham E Khoda, Javier Duarte

Abstract:Machine learning (ML) algorithms, particularly attention-based transformer models, have become indispensable for analyzing the vast data generated by particle physics experiments like ATLAS and CMS at the CERN LHC. Particle Transformer (ParT), a state-of-the-art model, leverages particle-level attention to improve jet-tagging tasks, which are critical for identifying particles resulting from proton collisions. This study focuses on interpreting ParT by analyzing attention heat maps and particle-pair correlations on the $\eta$-$\phi$ plane, revealing a binary attention pattern where each particle attends to at most one other particle. At the same time, we observe that ParT shows varying focus on important particles and subjets depending on decay, indicating that the model learns traditional jet substructure observables. These insights enhance our understanding of the model's internal workings and learning process, offering potential avenues for improving the efficiency of transformer architectures in future high-energy physics applications.

* Accepted at the Machine Learning and the Physical Sciences Workshop, NeurIPS 2024

Via

Access Paper or Ask Questions

HEP-JEPA: A foundation model for collider physics using joint embedding predictive architecture

Feb 06, 2025

Jai Bardhan, Radhikesh Agrawal, Abhiram Tilak, Cyrin Neeraj, Subhadip Mitra

Figure 1 for HEP-JEPA: A foundation model for collider physics using joint embedding predictive architecture

Figure 2 for HEP-JEPA: A foundation model for collider physics using joint embedding predictive architecture

Figure 3 for HEP-JEPA: A foundation model for collider physics using joint embedding predictive architecture

Figure 4 for HEP-JEPA: A foundation model for collider physics using joint embedding predictive architecture

Abstract:We present a transformer architecture-based foundation model for tasks at high-energy particle colliders such as the Large Hadron Collider. We train the model to classify jets using a self-supervised strategy inspired by the Joint Embedding Predictive Architecture. We use the JetClass dataset containing 100M jets of various known particles to pre-train the model with a data-centric approach -- the model uses a fraction of the jet constituents as the context to predict the embeddings of the unseen target constituents. Our pre-trained model fares well with other datasets for standard classification benchmark tasks. We test our model on two additional downstream tasks: top tagging and differentiating light-quark jets from gluon jets. We also evaluate our model with task-specific metrics and baselines and compare it with state-of-the-art models in high-energy physics. Project site: https://hep-jepa.github.io/

* 11 pages, 3 figures, 8 tables. Project website: https://hep-jepa.github.io/

Via

Access Paper or Ask Questions

Lie-Equivariant Quantum Graph Neural Networks

Nov 22, 2024

Jogi Suda Neto, Roy T. Forestano, Sergei Gleyzer, Kyoungchul Kong, Konstantin T. Matchev, Katia Matcheva

Figure 1 for Lie-Equivariant Quantum Graph Neural Networks

Figure 2 for Lie-Equivariant Quantum Graph Neural Networks

Figure 3 for Lie-Equivariant Quantum Graph Neural Networks

Figure 4 for Lie-Equivariant Quantum Graph Neural Networks

Abstract:Discovering new phenomena at the Large Hadron Collider (LHC) involves the identification of rare signals over conventional backgrounds. Thus binary classification tasks are ubiquitous in analyses of the vast amounts of LHC data. We develop a Lie-Equivariant Quantum Graph Neural Network (Lie-EQGNN), a quantum model that is not only data efficient, but also has symmetry-preserving properties. Since Lorentz group equivariance has been shown to be beneficial for jet tagging, we build a Lorentz-equivariant quantum GNN for quark-gluon jet discrimination and show that its performance is on par with its classical state-of-the-art counterpart LorentzNet, making it a viable alternative to the conventional computing paradigm.

* 10 pages, 5 figures, accepted to the Machine Learning with New Compute Paradigms (MLNCP) Workshop at NeurIPS 2024

Via

Access Paper or Ask Questions

Learning Symmetry-Independent Jet Representations via Jet-Based Joint Embedding Predictive Architecture

Dec 05, 2024

Subash Katel, Haoyang Li, Zihan Zhao, Raghav Kansal, Farouk Mokhtar, Javier Duarte

Figure 1 for Learning Symmetry-Independent Jet Representations via Jet-Based Joint Embedding Predictive Architecture

Figure 2 for Learning Symmetry-Independent Jet Representations via Jet-Based Joint Embedding Predictive Architecture

Figure 3 for Learning Symmetry-Independent Jet Representations via Jet-Based Joint Embedding Predictive Architecture

Abstract:In high energy physics, self-supervised learning (SSL) methods have the potential to aid in the creation of machine learning models without the need for labeled datasets for a variety of tasks, including those related to jets -- narrow sprays of particles produced by quarks and gluons in high energy particle collisions. This study introduces an approach to learning jet representations without hand-crafted augmentations using a jet-based joint embedding predictive architecture (J-JEPA), which aims to predict various physical targets from an informative context. As our method does not require hand-crafted augmentation like other common SSL techniques, J-JEPA avoids introducing biases that could harm downstream tasks. Since different tasks generally require invariance under different augmentations, this training without hand-crafted augmentation enables versatile applications, offering a pathway toward a cross-task foundation model. We finetune the representations learned by J-JEPA for jet tagging and benchmark them against task-specific representations.

* 5 pages, 2 figures. Accepted to Machine Learning for Physical Sciences NeurIPS 2024 workshop

Via

Access Paper or Ask Questions

Quantum Rationale-Aware Graph Contrastive Learning for Jet Discrimination

Nov 03, 2024

Md Abrar Jahin, Md. Akmol Masud, M. F. Mridha, Nilanjan Dey

Figure 1 for Quantum Rationale-Aware Graph Contrastive Learning for Jet Discrimination

Figure 2 for Quantum Rationale-Aware Graph Contrastive Learning for Jet Discrimination

Figure 3 for Quantum Rationale-Aware Graph Contrastive Learning for Jet Discrimination

Figure 4 for Quantum Rationale-Aware Graph Contrastive Learning for Jet Discrimination

Abstract:In high-energy physics, particle jet tagging plays a pivotal role in distinguishing quark from gluon jets using data from collider experiments. While graph-based deep learning methods have advanced this task beyond traditional feature-engineered approaches, the complex data structure and limited labeled samples present ongoing challenges. However, existing contrastive learning (CL) frameworks struggle to leverage rationale-aware augmentations effectively, often lacking supervision signals that guide the extraction of salient features and facing computational efficiency issues such as high parameter counts. In this study, we demonstrate that integrating a quantum rationale generator (QRG) within our proposed Quantum Rationale-aware Graph Contrastive Learning (QRGCL) framework significantly enhances jet discrimination performance, reducing reliance on labeled data and capturing discriminative features. Evaluated on the quark-gluon jet dataset, QRGCL achieves an AUC score of 77.53% while maintaining a compact architecture of only 45 QRG parameters, outperforming classical, quantum, and hybrid GCL and GNN benchmarks. These results highlight QRGCL's potential to advance jet tagging and other complex classification tasks in high-energy physics, where computational efficiency and feature extraction limitations persist.

* Submitted to IEEE Transactions series journal

Via

Access Paper or Ask Questions

Topic:Jet Tagging

Papers and Code