Abstract:Quantum computing holds great potential for solving socially relevant and computationally complex problems. Furthermore, quantum machine learning (QML) promises to rapidly improve our current machine learning capabilities. However, current noisy intermediate-scale quantum (NISQ) devices are constrained by limitations in the number of qubits and gate counts, which hinder their full capabilities. Furthermore, the design of quantum algorithms remains a laborious task, requiring significant domain expertise and time. Quantum Architecture Search (QAS) aims to streamline this process by automatically generating novel quantum circuits, reducing the need for manual intervention. In this paper, we propose a diffusion-based algorithm leveraging the LayerDAG framework to generate new quantum circuits. This method contrasts with other approaches that utilize large language models (LLMs), reinforcement learning (RL), variational autoencoders (VAE), and similar techniques. Our results demonstrate that the proposed model consistently generates 100% valid quantum circuit outputs.
Abstract:Identifying molecular properties, including side effects, is a critical yet time-consuming step in drug development. Failing to detect these side effects before regulatory submission can result in significant financial losses and production delays, and overlooking them during the regulatory review can lead to catastrophic consequences. This challenge presents an opportunity for innovative machine learning approaches, particularly hybrid quantum-classical models like the Quantum Kernel-Based Long Short-Term Memory (QK-LSTM) network. The QK-LSTM integrates quantum kernel functions into the classical LSTM framework, enabling the capture of complex, non-linear patterns in sequential data. By mapping input data into a high-dimensional quantum feature space, the QK-LSTM model reduces the need for large parameter sets, allowing for model compression without sacrificing accuracy in sequence-based tasks. Recent advancements have been made in the classical domain using augmented variations of the Simplified Molecular Line-Entry System (SMILES). However, to the best of our knowledge, no research has explored the impact of augmented SMILES in the quantum domain, nor the role of augmented Self-Referencing Embedded Strings (SELFIES) in either classical or hybrid quantum-classical settings. This study presents the first analysis of these approaches, providing novel insights into their potential for enhancing molecular property prediction and side effect identification. Results reveal that augmenting SELFIES yields in statistically significant improvements from SMILES by a 5.97% improvement for the classical domain and a 5.91% improvement for the hybrid quantum-classical domain.
Abstract:Circuit compilation, a crucial process for adapting quantum algorithms to hardware constraints, often operates as a ``black box,'' with limited visibility into the optimization techniques used by proprietary systems or advanced open-source frameworks. Due to fundamental differences in qubit technologies, efficient compiler design is an expensive process, further exposing these systems to various security threats. In this work, we take a first step toward evaluating one such challenge affecting compiler confidentiality, specifically, reverse-engineering compilation methodologies. We propose a simple ML-based framework to infer underlying optimization techniques by leveraging structural differences observed between original and compiled circuits. The motivation is twofold: (1) enhancing transparency in circuit optimization for improved cross-platform debugging and performance tuning, and (2) identifying potential intellectual property (IP)-protected optimizations employed by commercial systems. Our extensive evaluation across thousands of quantum circuits shows that a neural network performs the best in detecting optimization passes, with individual pass F1-scores reaching as high as 0.96. Thus, our initial study demonstrates the viability of this threat to compiler confidentiality and underscores the need for active research in this area.
Abstract:We introduce a novel technique that enables observation of quantum states without direct measurement, preserving them for reuse. Our method allows multiple quantum states to be observed at different points within a single circuit, one at a time, and saved into classical memory without destruction. These saved states can be accessed on demand by downstream applications, introducing a dynamic and programmable notion of quantum memory that supports modular, non-destructive quantum workflows. We propose a hardware-agnostic, machine learning-driven framework to capture non-destructive estimates, or "snapshots," of quantum states at arbitrary points within a circuit, enabling classical storage and later reconstruction, similar to memory operations in classical computing. This capability is essential for debugging, introspection, and persistent memory in quantum systems, yet remains difficult due to the no-cloning theorem and destructive measurements. Our guess-and-check approach uses fidelity estimation via the SWAP test to guide state reconstruction. We explore both gradient-based deep neural networks and gradient-free evolutionary strategies to estimate quantum states using only fidelity as the learning signal. We demonstrate a key component of our framework on IBM quantum hardware, achieving high-fidelity (approximately 1.0) reconstructions for Hadamard and other known states. In simulation, our models achieve an average fidelity of 0.999 across 100 random quantum states. This provides a pathway toward non-volatile quantum memory, enabling long-term storage and reuse of quantum information, and laying groundwork for future quantum memory architectures.
Abstract:With the growing interest in Quantum Machine Learning (QML) and the increasing availability of quantum computers through cloud providers, addressing the potential security risks associated with QML has become an urgent priority. One key concern in the QML domain is the threat of data poisoning attacks in the current quantum cloud setting. Adversarial access to training data could severely compromise the integrity and availability of QML models. Classical data poisoning techniques require significant knowledge and training to generate poisoned data, and lack noise resilience, making them ineffective for QML models in the Noisy Intermediate Scale Quantum (NISQ) era. In this work, we first propose a simple yet effective technique to measure intra-class encoder state similarity (ESS) by analyzing the outputs of encoding circuits. Leveraging this approach, we introduce a quantum indiscriminate data poisoning attack, QUID. Through extensive experiments conducted in both noiseless and noisy environments (e.g., IBM\_Brisbane's noise), across various architectures and datasets, QUID achieves up to $92\%$ accuracy degradation in model performance compared to baseline models and up to $75\%$ accuracy degradation compared to random label-flipping. We also tested QUID against state-of-the-art classical defenses, with accuracy degradation still exceeding $50\%$, demonstrating its effectiveness. This work represents the first attempt to reevaluate data poisoning attacks in the context of QML.
Abstract:Quantum machine learning (QML) is a rapidly emerging area of research, driven by the capabilities of Noisy Intermediate-Scale Quantum (NISQ) devices. With the progress in the research of QML models, there is a rise in third-party quantum cloud services to cater to the increasing demand for resources. New security concerns surface, specifically regarding the protection of intellectual property (IP) from untrustworthy service providers. One of the most pressing risks is the potential for reverse engineering (RE) by malicious actors who may steal proprietary quantum IPs such as trained parameters and QML architecture, modify them to remove additional watermarks or signatures and re-transpile them for other quantum hardware. Prior work presents a brute force approach to RE the QML parameters which takes exponential time overhead. In this paper, we introduce an autoencoder-based approach to extract the parameters from transpiled QML models deployed on untrusted third-party vendors. We experiment on multi-qubit classifiers and note that they can be reverse-engineered under restricted conditions with a mean error of order 10^-1. The amount of time taken to prepare the dataset and train the model to reverse engineer the QML circuit being of the order 10^3 seconds (which is 10^2x better than the previously reported value for 4-layered 4-qubit classifiers) makes the threat of RE highly potent, underscoring the need for continued development of effective defenses.
Abstract:Quantum machine learning (QML) is a category of algorithms that employ variational quantum circuits (VQCs) to tackle machine learning tasks. Recent discoveries have shown that QML models can effectively generalize from limited training data samples. This capability has sparked increased interest in deploying these models to address practical, real-world challenges, resulting in the emergence of Quantum Machine Learning as a Service (QMLaaS). QMLaaS represents a hybrid model that utilizes both classical and quantum computing resources. Classical computers play a crucial role in this setup, handling initial pre-processing and subsequent post-processing of data to compensate for the current limitations of quantum hardware. Since this is a new area, very little work exists to paint the whole picture of QMLaaS in the context of known security threats in the domain of classical and quantum machine learning. This SoK paper is aimed to bridge this gap by outlining the complete QMLaaS workflow, which encompasses both the training and inference phases and highlighting significant security concerns involving untrusted classical or quantum providers. QML models contain several sensitive assets, such as the model architecture, training/testing data, encoding techniques, and trained parameters. Unauthorized access to these components could compromise the model's integrity and lead to intellectual property (IP) theft. We pinpoint the critical security issues that must be considered to pave the way for a secure QMLaaS deployment.
Abstract:Quantum Machine Learning (QML) amalgamates quantum computing paradigms with machine learning models, providing significant prospects for solving complex problems. However, with the expansion of numerous third-party vendors in the Noisy Intermediate-Scale Quantum (NISQ) era of quantum computing, the security of QML models is of prime importance, particularly against reverse engineering, which could expose trained parameters and algorithms of the models. We assume the untrusted quantum cloud provider is an adversary having white-box access to the transpiled user-designed trained QML model during inference. Reverse engineering (RE) to extract the pre-transpiled QML circuit will enable re-transpilation and usage of the model for various hardware with completely different native gate sets and even different qubit technology. Such flexibility may not be obtained from the transpiled circuit which is tied to a particular hardware and qubit technology. The information about the number of parameters, and optimized values can allow further training of the QML model to alter the QML model, tamper with the watermark, and/or embed their own watermark or refine the model for other purposes. In this first effort to investigate the RE of QML circuits, we perform RE and compare the training accuracy of original and reverse-engineered Quantum Neural Networks (QNNs) of various sizes. We note that multi-qubit classifiers can be reverse-engineered under specific conditions with a mean error of order 1e-2 in a reasonable time. We also propose adding dummy fixed parametric gates in the QML models to increase the RE overhead for defense. For instance, adding 2 dummy qubits and 2 layers increases the overhead by ~1.76 times for a classifier with 2 qubits and 3 layers with a performance overhead of less than 9%. We note that RE is a very powerful attack model which warrants further efforts on defenses.
Abstract:The high expenses imposed by current quantum cloud providers, coupled with the escalating need for quantum resources, may incentivize the emergence of cheaper cloud-based quantum services from potentially untrusted providers. Deploying or hosting quantum models, such as Quantum Neural Networks (QNNs), on these untrusted platforms introduces a myriad of security concerns, with the most critical one being model theft. This vulnerability stems from the cloud provider's full access to these circuits during training and/or inference. In this work, we introduce STIQ, a novel ensemble-based strategy designed to safeguard QNNs against such cloud-based adversaries. Our method innovatively trains two distinct QNNs concurrently, hosting them on same or different platforms, in a manner that each network yields obfuscated outputs rendering the individual QNNs ineffective for adversaries operating within cloud environments. However, when these outputs are combined locally (using an aggregate function), they reveal the correct result. Through extensive experiments across various QNNs and datasets, our technique has proven to effectively masks the accuracy and losses of the individually hosted models by upto 76\%, albeit at the expense of $\leq 2\times$ increase in the total computational overhead. This trade-off, however, is a small price to pay for the enhanced security and integrity of QNNs in a cloud-based environment prone to untrusted adversaries. We also demonstrated STIQ's practical application by evaluating it on real 127-qubit IBM\_Sherbrooke hardware, showing that STIQ achieves up to 60\% obfuscation, with combined performance comparable to an unobfuscated model.
Abstract:Quantum Generative Adversarial Networks (qGANs) are at the forefront of image-generating quantum machine learning models. To accommodate the growing demand for Noisy Intermediate-Scale Quantum (NISQ) devices to train and infer quantum machine learning models, the number of third-party vendors offering quantum hardware as a service is expected to rise. This expansion introduces the risk of untrusted vendors potentially stealing proprietary information from the quantum machine learning models. To address this concern we propose a novel watermarking technique that exploits the noise signature embedded during the training phase of qGANs as a non-invasive watermark. The watermark is identifiable in the images generated by the qGAN allowing us to trace the specific quantum hardware used during training hence providing strong proof of ownership. To further enhance the security robustness, we propose the training of qGANs on a sequence of multiple quantum hardware, embedding a complex watermark comprising the noise signatures of all the training hardware that is difficult for adversaries to replicate. We also develop a machine learning classifier to extract this watermark robustly, thereby identifying the training hardware (or the suite of hardware) from the images generated by the qGAN validating the authenticity of the model. We note that the watermark signature is robust against inferencing on hardware different than the hardware that was used for training. We obtain watermark extraction accuracy of 100% and ~90% for training the qGAN on individual and multiple quantum hardware setups (and inferencing on different hardware), respectively. Since parameter evolution during training is strongly modulated by quantum noise, the proposed watermark can be extended to other quantum machine learning models as well.