Abstract:Quantum machine learning is often motivated by the exponentially large state space of quantum systems, but this promise leaves a basic generalization problem unresolved: how can a learner assign different meanings to unseen quantum directions when the training data provide no preferred basis, measurement frame, or other orienting structure? We address this identifiability problem by formulating supervised learning without an external quantum reference frame, so that predictions cannot depend on an arbitrary choice of Hilbert-space coordinates. This requirement forces the learned classifier to preserve every unitary symmetry left unbroken by the training data. We prove that whenever the training states fail to span the full Hilbert space, all pure states orthogonal to their span must receive the same prediction -- even when those states are mutually orthogonal and perfectly distinguishable once an appropriate measurement is supplied. The limitation is therefore not caused by state discrimination, optimization, or computational power, but by missing reference information. We further establish a robust version under weak symmetry breaking and show that learning generic unstructured concepts on multiqubit systems requires exponentially many independently oriented training directions. Numerical illustrations visualize the resulting prediction collapse and its controlled relaxation. Our results identify feature maps, measurement bases, Hamiltonians, locality, symmetry priors, architectures, and sufficiently diverse training states as operational resources for generalization. The central implication is that Hilbert-space dimension alone is not a learnable feature space: successful QML must specify the physical structure that gives unseen quantum directions semantic meaning.
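One way to see why the collapse occurs is the following short sketch. It assumes the classifier $f$ satisfies $f(U\rho U^\dagger)=f(\rho)$ for every unitary $U$ that fixes each training state. The symbols $S$, $f$, $U$, and $V$ are introduced here for illustration and are not taken from the abstract.

$$
S=\mathrm{span}\{|\psi_1\rangle,\dots,|\psi_m\rangle\},\qquad
|\phi\rangle,|\phi'\rangle\in S^{\perp}
\;\Longrightarrow\;
\exists\,U=\mathbb{1}_S\oplus V \ \text{with}\ V|\phi\rangle=|\phi'\rangle,
$$
$$
f\big(|\phi'\rangle\langle\phi'|\big)
= f\big(U|\phi\rangle\langle\phi|U^\dagger\big)
= f\big(|\phi\rangle\langle\phi|\big).
$$

Because $U$ acts as the identity on $S$, it leaves every training state unchanged, so the invariance assumption applies and the two orthogonal-complement states receive the same prediction.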
Abstract:A central principle in quantum machine learning is that an ansatz should be expressive enough to represent the quantum data of interest. Yet expressibility is statistically meaningful only insofar as it can be learned from finitely many copies of an unknown quantum state. In this work, we develop an information-theoretic Occam theory for quantum data generated by finite-size quantum circuits. For the class $S_{n,G}$ of $n$-qubit pure states preparable with at most $G$ two-qubit gates, a metric-entropy argument gives the realizable sample law $\widetilde{\Theta}(G/\epsilon^2)$ in the circuit-limited regime. For an arbitrary source $\hat{\rho}$, we introduce the best $G$-gate approximation error $d_G(\hat{\rho})$ and the approximate circuit complexity $C_\eta(\hat{\rho})$. We prove an agnostic quantum Occam theorem: with $M$ copies, one can learn up to the best $G$-gate approximation error plus a statistical penalty $\widetilde{O}(\sqrt{G/M})$. We then remove the need to know $G$ in advance through an adaptive model-selection theorem whose oracle inequality selects the circuit complexity justified by the data. Matching lower bounds yield a sample-supported expressibility law: at trace-distance accuracy $\epsilon$, $M$ samples can support only $G_{\rm supported} \simeq M\epsilon^2$ gates, up to logarithmic factors and tomography saturation at $2^n$. Circuit complexity thus becomes an adaptive statistical resource rather than a static promise. Our framework turns bounded circuit complexity into a model-selection principle for quantum machine learning.
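The scaling laws quoted above can be turned into a rough back-of-the-envelope calculator. The sketch below is only illustrative: it drops all constants and logarithmic factors, and the function names `occam_penalty` and `supported_gates` are hypothetical, not taken from the paper.

```python
import math

def occam_penalty(G: int, M: int) -> float:
    """Statistical penalty sqrt(G/M), with constants and log factors dropped."""
    return math.sqrt(G / M)

def supported_gates(M: int, eps: float, n: int) -> float:
    """Gate count that M copies can support at trace-distance accuracy eps.

    Uses G ~ M * eps**2, capped at the tomography-saturation scale 2**n.
    Constants and logarithmic factors are deliberately ignored (assumption).
    """
    return min(M * eps**2, 2**n)

# Circuit-limited regime: the cap 2**n is not reached.
print(supported_gates(M=10_000, eps=0.1, n=10))
# Tomography-saturated regime: the cap 2**n = 16 applies.
print(supported_gates(M=10**9, eps=0.1, n=4))
# Penalty paid for fitting a 100-gate model with 10,000 copies.
print(occam_penalty(G=100, M=10_000))
```

Read this way, increasing $M$ at fixed $\epsilon$ raises the circuit complexity the data can justify until the $2^n$ cap is reached.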
Abstract:Query-separated computation forces a representation to play an operational role: data are encoded before a query is known, and a later decoder can answer only through the intermediate interface. In this regime the representation functions as a message rather than merely as a feature map. We formalize this observation by embedding information causality (IC) into representation learning, obtaining a framework called neural information causality (Neural-IC). The revised formulation separates two logically distinct statements. First, every query-separated architecture induces a random-access communication experiment and obeys the embedding inequality $I_{\mathrm{N\text{-}RAC}}\le I(\vec a:H,B)$. Second, any independently certified physical capacity bound on the interface, such as a hard $m$-bit alphabet, a finite-precision register, or a power-constrained noisy channel, implies $I_{\mathrm{N\text{-}RAC}}\le C_H$. This separation avoids treating capacity as a post hoc definition and makes Neural-IC an operational diagnostic for query leakage, precision leakage, and episode-specific memory. We also provide an exact one-bit classical RAC benchmark, showing explicitly that the relevant quantum enhancement is not total information beyond the bottleneck, but fair query-conditioned access. For CHSH-type correlation layers, nested Neural-RAC protocols multiply correlation biases across depth; requiring stability of a one-bit bottleneck for arbitrary depth selects the Tsirelson threshold. We extend the analysis to asymmetric seed biases, to multi-capacity finite-depth phase diagrams, and to correlated data via a conditional information score. Controlled simulations, including straight-through binary bottlenecks and deliberately leaky ablations, verify that apparent violations are accounted for by broken query separation or undercounted capacity.
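The depth argument can be illustrated with the standard nested random-access-code calculation from the information-causality literature. This is a sketch under stated assumptions, not the paper's Neural-RAC construction: it assumes uniform independent input bits, a depth-$k$ nesting that serves $2^k$ bits with guessing bias $E^k$, and the score $2^k\,[1-h((1+E^k)/2)]$ compared against a one-bit capacity. The names `nested_rac_score` and `TSIRELSON` are hypothetical.

```python
import math

def binary_entropy(p: float) -> float:
    """Shannon binary entropy h(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def nested_rac_score(E: float, depth: int) -> float:
    """Information score of a depth-`depth` nested RAC with seed bias E.

    Assumes 2**depth uniform bits, each guessed with probability
    (1 + E**depth) / 2, so the score is 2**depth * (1 - h(p_guess)).
    """
    n_bits = 2 ** depth
    p_guess = (1 + E ** depth) / 2
    return n_bits * (1 - binary_entropy(p_guess))

TSIRELSON = 1 / math.sqrt(2)  # CHSH correlation bias at the quantum bound

for E in (TSIRELSON, 0.75):
    scores = [nested_rac_score(E, k) for k in range(1, 11)]
    print(f"E={E:.4f}  max score over depth 1..10: {max(scores):.3f}")
```

For small bias the score behaves like $(2E^2)^k/(2\ln 2)$, so it stays bounded for every depth only when $2E^2\le 1$, which is the Tsirelson value $E=1/\sqrt{2}$; a seed bias of 0.75 eventually exceeds the one-bit budget.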
Abstract:Beyond binary classification, learnability can become a logically fragile notion: in EMX, even the class of all finite subsets of $[0,1]$ is learnable in some models of ZFC and not in others. We argue that the paradox is operational. The standard definitions quantify over arbitrary set-theoretic learners that implicitly assume non-operational resources (infinite precision, unphysical data access, and non-representable outputs). We introduce physics-aware learnability (PL), which defines learnability relative to an explicit access model -- a family of admissible physical protocols. Finite-precision coarse-graining reduces continuum EMX to a countable problem via an exact pushforward/pullback reduction that preserves the EMX objective, making the independence example provably learnable with explicit $(\epsilon,\delta)$ sample complexity. For quantum data, admissible learners are exactly POVMs on $d$ copies, turning sample size into copy complexity and yielding Helstrom-type lower bounds. For finite no-signaling and quantum models, PL feasibility becomes a linear or semidefinite program and is therefore decidable.
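To make the notion of copy complexity concrete, the sketch below evaluates the textbook Helstrom bound for the simplest case: two equiprobable pure states with overlap $c=|\langle\psi|\phi\rangle|$, for which the optimal success probability on $d$ copies is $\tfrac12\big(1+\sqrt{1-c^{2d}}\big)$. This two-state setting and the function names `helstrom_success` and `min_copies` are illustrative assumptions, not the paper's general construction.

```python
import math

def helstrom_success(overlap: float, copies: int) -> float:
    """Optimal success probability for discriminating two equiprobable pure
    states with |<psi|phi>| = overlap, given `copies` copies of the state."""
    return 0.5 * (1 + math.sqrt(1 - overlap ** (2 * copies)))

def min_copies(overlap: float, delta: float) -> int:
    """Smallest d whose Helstrom error is at most delta.

    Assumes 0 <= overlap < 1 and 0 < delta < 1/2, so the loop terminates.
    """
    d = 1
    while 1 - helstrom_success(overlap, d) > delta:
        d += 1
    return d

# No measurement on fewer copies can reach error <= delta,
# so this value is a lower bound on the copy complexity.
print(min_copies(overlap=0.9, delta=0.01))
```

Since the Helstrom measurement is optimal over all POVMs, the returned value bounds from below the number of copies any admissible learner needs in this two-state problem.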
Abstract:A learner's ability to generate a hypothesis that closely approximates the target function is crucial in machine learning. Achieving this requires sufficient data; however, unauthorized access by an eavesdropping learner can lead to security risks. It is therefore important to protect the performance advantage of the "authorized" learner by limiting the quality of the training data accessible to eavesdroppers. Unlike previous studies focusing on encryption or access controls, we provide a theorem that, using quantum label encoding, ensures superior learning outcomes exclusively for the authorized learner. We work in the probably-approximately-correct (PAC) learning framework and introduce the concept of learning probability to quantitatively assess learner performance. Our theorem yields a condition under which, given a training dataset, an authorized learner is guaranteed to achieve a certain quality of learning outcome, while eavesdroppers are not. Notably, this condition can be constructed from quantities of the training data measurable by the authorized learner alone, namely its size and noise degree. We validate our theoretical results and predictions through image classification with convolutional neural networks (CNNs).