Quantum Computing
June 28, 2026

Calibrating A Noisy Quantum Kernel With Grid Search For Adversarial Shift Robustness

Written by Zed Qubit

Why I got obsessed with “shift robustness” in quantum kernels

I stumbled into a weird failure mode while experimenting with quantum machine learning—specifically quantum kernels, where you use a quantum circuit to measure “similarity” between data points.

My model worked great on the training distribution, then collapsed when I applied a tiny adversarial shift to the input features (think: adding a very small offset that’s chosen to break the model). The unsettling part: nothing about the circuit changed, just a small preprocessing tweak.

So I built a very practical toolchain around one niche idea: calibrating the kernel circuit with grid search so it becomes robust to small feature shifts, using a toy “shift adversary” to stress-test the kernel.

The outcome: a workflow I now use as a sanity check whenever I build quantum-kernel ML pipelines.


The core idea: a quantum kernel as a similarity matrix

A kernel function takes two data points \(x\) and \(x'\) and outputs a similarity score \(K(x, x')\). A kernel method then learns from the matrix of these pairwise similarities.

In a quantum kernel, the similarity comes from preparing a quantum state for each input and measuring their overlap. Concretely, with a data-dependent circuit \(U(x)\), the kernel is defined as:

\[ K(x, x') = |\langle \psi(x) | \psi(x') \rangle|^2 \]

where \(|\psi(x)\rangle = U(x)\,|0\rangle\). On hardware or a shot-based simulator, this value is estimated from measurement counts rather than computed exactly.

In code, this usually becomes: compute a Gram matrix \(K\) where entry \((i, j)\) is the estimated squared overlap between the states for \(x_i\) and \(x_j\).
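
To make the Gram matrix concrete without any quantum SDK, here is a minimal NumPy sketch. It assumes a simplified feature map (one RY(πx) rotation per qubit, no entangling layer) and computes the overlaps exactly from statevectors, so it is an illustration of the definition above, not the sampled estimator used later in this post. The helper names `product_state` and `exact_gram` are made up for this example.

```python
import numpy as np

def product_state(x):
    """Statevector of RY(pi * x_k)|0> on each qubit (assumed toy feature map, no entanglement)."""
    state = np.array([1.0])
    for xk in x:
        theta = np.pi * xk
        # RY(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>
        qubit = np.array([np.cos(theta / 2), np.sin(theta / 2)])
        state = np.kron(state, qubit)
    return state

def exact_gram(X, X2=None):
    """Exact fidelity kernel K[i, j] = |<psi(x_i)|psi(x2_j)>|^2."""
    X2 = X if X2 is None else X2
    A = np.array([product_state(x) for x in X])
    B = np.array([product_state(x) for x in X2])
    return np.abs(A @ B.conj().T) ** 2

X = np.array([[0.1, -0.3], [0.4, 0.2], [-0.5, 0.6]])
K = exact_gram(X)
print(np.round(K, 3))
```

For this particular toy map the kernel reduces to a product of cos² terms in the feature differences, which is one way to see why a shift applied only to the test inputs changes every kernel entry.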


The niche part: adversarial shift robustness calibration

I wanted a kernel that doesn’t just look good on clean data. So I introduced an adversarial preprocessing step:

  • Start with a dataset \(X\).
  • Apply a small shift \(\delta\) to get \(X_\delta = X + \delta\).
  • Choose \(\delta\) from a small candidate set, picking the shift that hurts performance the most.

Then I tuned kernel hyperparameters using grid search to maximize performance under this shift attack.

This gives a concrete objective:

Pick kernel settings that keep classification accuracy high even when inputs are shifted by a worst-case small offset.
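
Written as a formula, the objective is a max-min problem. The symbols here are shorthand introduced for this sketch: \(\theta\) stands for one combination of kernel and classifier settings from the grid \(\Theta\), \(\Delta\) is the candidate shift set, and \(f_\theta\) is the classifier trained on clean data with settings \(\theta\).

```latex
\[
\theta^{*} \;=\; \arg\max_{\theta \in \Theta} \; \min_{\delta \in \Delta} \;
\operatorname{acc}\bigl(f_{\theta},\; X_{\text{test}} + \delta,\; y_{\text{test}}\bigr)
\]
```

The inner minimum plays the role of the adversary, and the outer maximum is the grid search.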


Building blocks I used (and what they mean)

I used:

  1. Feature map circuit: turns real-valued features into quantum states.
  2. Qiskit primitives: run circuits and return measurement results.
  3. Kernel Gram matrix: builds \(K_{ij}\) from circuit overlap estimates.
  4. Classical kernel SVM: trains a classifier on the kernel matrix.
  5. Grid search: tries hyperparameters and evaluates under adversarial shifts.

The implementation below uses:

  • Qiskit (with the Aer simulator) for circuits and sampling
  • scikit-learn for the SVM
  • a manual shift adversary (small discrete candidate shifts) to keep everything transparent and reproducible.

Working code: quantum kernel + adversarial shift grid search

import numpy as np
from itertools import product

from qiskit import QuantumCircuit
from qiskit.primitives import BackendSamplerV2
from qiskit_aer import AerSimulator
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC


def feature_map(x, reps=1, entanglement=0.5):
    """
    Create a simple data-encoding circuit U(x).

    x: 1D array of length d
    reps: number of repetition blocks
    entanglement: controls strength of entangling rotations
    """
    d = len(x)
    qc = QuantumCircuit(d)
    for _ in range(reps):
        # Encode each feature as a rotation around Y.
        for i in range(d):
            qc.ry(np.pi * x[i], i)
        # Add entangling structure: a data-dependent RY sandwiched between CNOTs
        # on neighboring qubits (an effective two-qubit ZY-type rotation).
        # The 'entanglement' hyperparameter scales the rotation angle.
        for i in range(d - 1):
            qc.cx(i, i + 1)
            qc.ry(entanglement * np.pi * x[i] * x[i + 1], i + 1)
            qc.cx(i, i + 1)
    return qc


def append_state_preparation(qc, x, reps, entanglement, qubits):
    """Append U(x) onto an existing circuit qc."""
    fm = feature_map(x, reps=reps, entanglement=entanglement)
    qc.compose(fm, qubits=qubits, inplace=True)
    return qc


def estimate_kernel_gram(X, X2=None, reps=1, entanglement=0.5, shots=2048, seed=1234):
    """
    Estimate a Gram matrix K where K[i, j] = |<psi(x_i)|psi(x'_j)>|^2.

    We use a standard overlap estimation trick:
    - Build a circuit that prepares |psi(x)> on one register and |psi(x')> on another.
    - Use a SWAP test to estimate the overlap.
    """
    symmetric = X2 is None or X2 is X
    if X2 is None:
        X2 = X
    n1, d = X.shape
    n2, _ = X2.shape

    backend = AerSimulator(seed_simulator=seed)
    # BackendSamplerV2 wraps a backend in the V2 sampler interface
    # (the V1 Sampler class does not accept a backend argument).
    sampler = BackendSamplerV2(backend=backend)

    # SWAP test layout per kernel entry: 1 ancilla + 2*d system qubits.
    anc = 0
    sysA = list(range(1, 1 + d))
    sysB = list(range(1 + d, 1 + 2 * d))

    K = np.zeros((n1, n2), dtype=float)
    for i in range(n1):
        for j in range(n2):
            qc = QuantumCircuit(1 + 2 * d, 1)
            # Prepare |psi(x)> on sysA and |psi(x')> on sysB.
            append_state_preparation(qc, X[i], reps, entanglement, qubits=sysA)
            append_state_preparation(qc, X2[j], reps, entanglement, qubits=sysB)

            # SWAP test: H on the ancilla, a controlled-SWAP between sysA[k] and
            # sysB[k] for each k, H again, then measure the ancilla.
            qc.h(anc)
            for k in range(d):
                qc.cswap(anc, sysA[k], sysB[k])
            qc.h(anc)
            qc.measure(anc, 0)

            # Run sampling. The default classical register is named "c";
            # get_counts() maps bitstrings ("0" or "1") to counts.
            result = sampler.run([qc], shots=shots).result()
            counts = result[0].data.c.get_counts()
            p0 = counts.get("0", 0) / shots

            # SWAP test: p0 = (1 + |<psi|phi>|^2) / 2  =>  |<psi|phi>|^2 = 2*p0 - 1
            K[i, j] = max(0.0, min(1.0, 2 * p0 - 1))

    if symmetric:
        # Ensure symmetry when X2 is X (helps numerics): average K[i, j] and K[j, i].
        K = (K + K.T) / 2
    return K


def make_toy_data(n=60, d=2, seed=0):
    """
    Two overlapping Gaussian blobs so classification is non-trivial but fast.
    Samples are shuffled so that slicing gives a mixed-class train/test split.
    """
    rng = np.random.default_rng(seed)
    X0 = rng.normal(loc=-0.6, scale=0.45, size=(n // 2, d))
    X1 = rng.normal(loc=+0.6, scale=0.45, size=(n // 2, d))
    X = np.vstack([X0, X1])
    y = np.hstack([np.zeros(n // 2), np.ones(n // 2)]).astype(int)
    perm = rng.permutation(len(y))
    return X[perm], y[perm]


def shift_adversary_candidates(d, eps=0.12):
    """
    A small discrete set of candidate shifts for a "worst-case" perturbation search.
    """
    # Candidate shifts along axes and diagonals
    base = [
        np.zeros(d),
        np.array([eps if k == 0 else 0 for k in range(d)]),
        np.array([-eps if k == 0 else 0 for k in range(d)]),
    ]
    if d >= 2:
        base += [
            np.array([0, eps] + [0] * (d - 2)),
            np.array([0, -eps] + [0] * (d - 2)),
            np.array([eps / 2, eps / 2] + [0] * (d - 2)),
            np.array([-eps / 2, -eps / 2] + [0] * (d - 2)),
        ]
    # Deduplicate
    uniq = []
    for s in base:
        if not any(np.allclose(s, t) for t in uniq):
            uniq.append(s)
    return uniq


def train_kernel_svm(K_train, y_train, C=1.0):
    """
    Train an SVM using a precomputed kernel matrix.
    """
    clf = SVC(kernel="precomputed", C=C)
    clf.fit(K_train, y_train)
    return clf


def evaluate_under_shift(X_train, y_train, X_test, y_test, reps, entanglement, C, shots, shifts):
    """
    For each shift delta, build a kernel and compute accuracy on the shifted test set.
    Return the worst-case (minimum) accuracy across shifts.
    """
    # Compute the training kernel once for a given kernel setting
    K_train = estimate_kernel_gram(X_train, X_train, reps=reps, entanglement=entanglement, shots=shots)
    clf = train_kernel_svm(K_train, y_train, C=C)

    worst_acc = 1.0
    for delta in shifts:
        X_test_shifted = X_test + delta
        # Kernel between test and train: shape (n_test, n_train),
        # which is the layout SVC expects for a precomputed kernel at predict time.
        K_test = estimate_kernel_gram(
            X_test_shifted, X_train, reps=reps, entanglement=entanglement, shots=shots
        )
        y_pred = clf.predict(K_test)
        acc = accuracy_score(y_test, y_pred)
        worst_acc = min(worst_acc, acc)
    return worst_acc


def main():
    # Data (already shuffled by make_toy_data)
    X, y = make_toy_data(n=80, d=2, seed=2)

    # Train/test split (simple)
    n_train = 60
    X_train, X_test = X[:n_train], X[n_train:]
    y_train, y_test = y[:n_train], y[n_train:]

    # Adversarial shift candidates
    d = X.shape[1]
    shifts = shift_adversary_candidates(d, eps=0.14)

    # Grid search over niche hyperparameters:
    #   reps: repetition depth of the feature map
    #   entanglement: scaling of entangling rotations
    #   C: SVM regularization
    grid = list(product([1, 2, 3], [0.1, 0.5, 0.9], [0.5, 1.0, 2.0]))

    shots = 1024  # keep it fast for a blog demo
    best = None

    print("Running adversarial shift calibration grid search...\n")
    for reps, entanglement, C in grid:
        worst_acc = evaluate_under_shift(
            X_train, y_train, X_test, y_test,
            reps=reps, entanglement=entanglement, C=C,
            shots=shots, shifts=shifts
        )
        print(f"reps={reps:>2}, ent={entanglement:.2f}, C={C:.1f} -> worst_acc={worst_acc:.3f}")
        if best is None or worst_acc > best["worst_acc"]:
            best = {
                "reps": reps,
                "entanglement": entanglement,
                "C": C,
                "worst_acc": worst_acc,
            }

    print("\nBest calibrated kernel settings:")
    print(best)

    # Also report the clean accuracy (delta=0) for context
    clean_acc = evaluate_under_shift(
        X_train, y_train, X_test, y_test,
        reps=best["reps"], entanglement=best["entanglement"], C=best["C"],
        shots=shots,
        shifts=[np.zeros(d)]  # only clean
    )
    print(f"\nClean accuracy under chosen settings: {clean_acc:.3f}")


if __name__ == "__main__":
    main()
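
The SWAP test relation used in estimate_kernel_gram, p0 = (1 + |<psi|phi>|^2) / 2, can be checked numerically with plain linear algebra. The sketch below is an assumption-level illustration, not part of the pipeline: it builds the H, controlled-SWAP, H sequence as explicit matrices for two small statevectors and reads off the ancilla probability exactly. The function name `swap_test_p0` is hypothetical.

```python
import numpy as np

def swap_test_p0(psi, phi):
    """Exact probability of reading 0 on the ancilla of a SWAP test on |psi>, |phi>."""
    dim = len(psi)
    n = dim * dim
    H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
    # SWAP on the two system registers: |a>|b> -> |b>|a>
    swap = np.zeros((n, n))
    for a in range(dim):
        for b in range(dim):
            swap[b * dim + a, a * dim + b] = 1.0
    # Controlled-SWAP: identity when the ancilla is 0, SWAP when it is 1
    cswap = np.block([
        [np.eye(n), np.zeros((n, n))],
        [np.zeros((n, n)), swap],
    ])
    h_anc = np.kron(H, np.eye(n))
    # Ancilla is the leading tensor factor and starts in |0>
    state = np.kron(np.array([1.0, 0.0]), np.kron(psi, phi))
    state = h_anc @ cswap @ h_anc @ state
    # The first n amplitudes are the ones with ancilla = 0
    return float(np.sum(np.abs(state[:n]) ** 2))

psi = np.array([np.cos(0.3), np.sin(0.3)])
phi = np.array([np.cos(1.0), np.sin(1.0)])
p0 = swap_test_p0(psi, phi)
overlap_sq = abs(np.vdot(psi, phi)) ** 2
print(p0, (1 + overlap_sq) / 2)  # the two values should agree
```

Identical states give p0 = 1 and orthogonal states give p0 = 0.5, so small overlaps sit near the coin-flip point of the ancilla, which is where sampling noise is largest.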

What each important section does (and why)

  • feature_map(x, reps, entanglement)
    I encode each feature into a qubit rotation and then add a controlled entangling pattern.
    The two hyperparameters are intentionally small and interpretable:

    • reps increases the circuit depth of the encoding.
    • entanglement scales how strongly the circuit mixes features.
  • estimate_kernel_gram(...)
    For each pair \((x_i, x'_j)\), I build a SWAP test circuit to estimate the overlap \(|\langle \psi(x_i) | \psi(x'_j)\rangle|^2\).
    This produces the kernel Gram matrix needed by the SVM.

  • evaluate_under_shift(...)
    This is the adversarial calibration loop:

    • Train the kernel SVM on clean training data.
    • Test on shifted versions of the test data.
    • Take worst-case accuracy across shift candidates.
  • Grid search
    I iterate over a small set of (reps, entanglement, C) values and choose the setting that maximizes worst-case performance.
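
Stripped of the quantum parts, the calibration loop described above is a max-min search. Here is a minimal, dependency-free sketch of that skeleton. The `evaluate(params, shift)` callback and the `toy_accuracy` function are hypothetical stand-ins with made-up numbers; in the real pipeline that role is played by training the kernel SVM and scoring it on the shifted test set.

```python
def robust_grid_search(evaluate, grid, shifts):
    """Return (best_params, best_worst_score), maximizing the minimum score over shifts.

    evaluate(params, shift) -> float is a caller-supplied (hypothetical) scoring callback.
    """
    best_params, best_worst = None, float("-inf")
    for params in grid:
        # Inner loop is the adversary: take the worst score across candidate shifts.
        worst = min(evaluate(params, shift) for shift in shifts)
        if worst > best_worst:
            best_params, best_worst = params, worst
    return best_params, best_worst

def toy_accuracy(params, shift):
    """Illustrative only: sharper models score higher on clean data but degrade faster under shift."""
    s = params["sharpness"]
    return 0.9 + 0.02 * s - s * abs(shift)

grid = [{"sharpness": s} for s in (1, 2, 3)]
shifts = [0.0, 0.1, -0.1]

best_params, best_worst = robust_grid_search(toy_accuracy, grid, shifts)
clean_best = max(grid, key=lambda p: toy_accuracy(p, 0.0))
print("robust choice:", best_params, "worst-case score:", round(best_worst, 3))
print("clean-data choice:", clean_best)
```

In this toy setup the setting that wins on clean data is not the one that wins under the worst-case shift, which is the whole reason for scoring by the minimum.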


What I observed when I ran this over a weekend

In my runs, I consistently saw a pattern:

  • Small circuits (reps=1) could be overconfident—great on clean data, worse under shift.
  • Deeper circuits (reps=3) sometimes improved worst-case, but not always; it depended on the entanglement scale.
  • A mid-range entanglement value often acted like a “smoothing” knob: it made the kernel similarity less brittle to tiny coordinate changes.

The most important practical takeaway wasn’t “which hyperparameter wins”—it was that evaluating only on clean data hides a major robustness problem.


Practical notes (so this doesn’t derail anyone)

  • This demo uses shot-based sampling (shots=1024). Low shot counts add noise to kernel estimates. That’s realistic for near-term experiments, and it matters for robustness.
  • Kernel matrices are expensive: the Gram matrix costs roughly \(O(n^2)\) circuit evaluations. For real workloads you’d use batching/optimizations or kernel approximations.
  • SWAP test overlap estimation is conceptually clean, but alternative overlap estimators can be faster depending on the backend.
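
To put rough numbers on the first two notes, here is a small back-of-the-envelope sketch using only the standard library. It assumes the demo settings from main() (60 training points, 20 test points, 7 candidate shifts for d=2, 27 grid points) and treats each SWAP test outcome as an independent coin flip, ignoring the clipping to [0, 1]. Both helper names are made up for this example.

```python
import math

def swap_test_std_error(overlap_sq, shots):
    """Standard error of the estimate 2*p0_hat - 1 when p0 = (1 + overlap_sq) / 2."""
    p0 = (1.0 + overlap_sq) / 2.0
    # p0_hat is a binomial proportion; the factor 2 comes from the 2*p0 - 1 rescaling.
    return 2.0 * math.sqrt(p0 * (1.0 - p0) / shots)

def circuits_per_grid_point(n_train, n_test, n_shifts):
    """Circuits for one grid point: one train-train Gram plus one test-train Gram per shift."""
    return n_train * n_train + n_shifts * n_test * n_train

se = swap_test_std_error(overlap_sq=0.0, shots=1024)
per_point = circuits_per_grid_point(n_train=60, n_test=20, n_shifts=7)
total = 27 * per_point

print(f"std error of a near-zero overlap at 1024 shots: {se:.5f}")
print(f"circuits per grid point: {per_point}, whole grid: {total}")
```

Under these assumptions a single kernel entry near zero overlap carries a standard error of about 0.03, and the full grid runs a few hundred thousand circuits, so reducing the grid or the training set is the quickest way to shorten a run.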

Conclusion

I built a very specific quantum-kernel calibration loop focused on a niche failure mode: tiny adversarial feature shifts. By estimating kernel Gram matrices via SWAP tests and then doing grid search to maximize worst-case accuracy under shift perturbations, I turned a fragile quantum classifier into one with a much clearer robustness story. The big lesson for my quantum machine learning experiments was simple: calibration only counts if you also measure performance under the kinds of input changes that break naive “clean-data” results.