A hybrid quantum-classical classification model based on branching multi-scale entanglement renormalization ansatz

Yan-Yan Hou; Jian Li; Tao Xu; Xin-Yu Liu

doi:10.1038/s41598-024-69384-6

2024 Aug 9;14:18521. doi: 10.1038/s41598-024-69384-6

PMCID: PMC11316021 PMID: 39122811

Abstract

Tensor networks are emerging architectures for implementing quantum classification models. The branching multi-scale entanglement renormalization ansatz (BMERA) is a tensor network known for its enhanced entanglement properties. This paper introduces a hybrid quantum-classical classification model based on BMERA and explores the correlation between circuit layout, expressiveness, and classification accuracy. Additionally, we present an autodifferentiation method for computing the cost function gradient, which serves as a viable option for other hybrid quantum-classical models. Through numerical experiments, we demonstrate the accuracy and robustness of our classification model in tasks such as image recognition and cluster excitation discrimination, offering a novel approach for designing quantum classification models.

Keywords: Tensor networks, Hybrid quantum-classical classification model, Branching multi-scale entanglement renormalization ansatz (BMERA), Quantum machine learning

Subject terms: Quantum information, Computer science

Introduction

Machine learning has made significant strides in diverse scientific and technological domains, such as image recognition and natural language processing. The rapid growth of big data and artificial intelligence has increased the demand for better machine learning performance. Quantum superposition and entanglement make quantum computation a promising approach to processing large-scale data. Researchers have integrated quantum computation with machine learning, resulting in a series of quantum machine learning algorithms^1–7 applicable to discriminative^8,9 and generative learning^10,11 tasks. Tensor networks (TNs) extract features by contracting tensors and serve as crucial numerical tools for analyzing quantum many-body systems.

Recently, researchers have begun applying TNs in quantum classification models. Tree tensor networks (TTN)¹², whose circuit depth is logarithmic in the number of input qubits, were the first to be employed. Another tensor network used in quantum classification models is the multi-scale entanglement renormalization ansatz (MERA). MERA has a structure similar to TTN but incorporates isometry operations to capture entanglement across a wider scale. Research suggests that quantum machine learning models utilizing MERA outperform those utilizing TTNs in both classification and generative tasks¹³.

TNs can be regarded as quantum neural networks with specific structures, characterized by logarithmic circuit depth, which mitigates the barren plateau problem during parameter optimization¹⁴. Quantum classification models utilizing TTN or MERA leverage entanglement to capture the correlation among data features, where the entanglement entropy must satisfy the area law¹⁵. However, in some complex classification tasks, capturing the correlation among data features may require entanglement entropy that violates the area law. Are there TNs better suited for implementing quantum classification models? The branching multi-scale entanglement renormalization ansatz (BMERA) is a generalization of MERA whose entanglement entropy can break the area law and support the volume law. These favorable entanglement characteristics make BMERA a suitable choice for quantum classification models.

Unlike general neural networks, BMERA relies on tensor contraction to process data, where the input data of local tensors cannot be sent to multiple outputs¹⁶. Consequently, a quantum classification model based on BMERA alone is not scalable. Inspired by the entanglement advantage of BMERA and the scalability of classical neural networks, we design a hybrid quantum-classical classification model. In this model, BMERA and classical neural networks cooperate to construct the cost function of the classification model. Subsequently, a classical optimizer is employed to optimize the model parameters. This classification model transfers some tasks that are challenging for quantum devices to classical neural networks, so the quantum circuit has a shallow depth and is suitable for implementation on noisy intermediate-scale quantum (NISQ) devices^17,18.

Our work contributes in three main ways. Firstly, we propose a novel parameterized quantum circuit based on BMERA, which utilizes a shallower circuit for data feature extraction. Secondly, we analyze the relationship between circuit layout, expressibility, and classification accuracy, providing valuable insights for enhancing classification accuracy. Lastly, we propose an autodifferentiation method for computing the cost function gradient, which serves as a viable option for other hybrid quantum-classical models. Simulation results demonstrate that the proposed model surpasses quantum classification models based on TTN and MERA in both MNIST (handwriting recognition with binary images) and quantum cluster state excitation discrimination tasks. Additionally, we illustrate that the proposed model exhibits robustness against depolarization noise.

This paper is organized as follows: “Method” section presents the hybrid quantum-classical classification model based on BMERA, and “Numerical simulations and discussions” section verifies the accuracy and robustness of this model in classical and quantum classification tasks. Finally, we present our conclusions and discuss future research directions.

Method

In this section, we introduce the hybrid quantum-classical classification model based on BMERA. If the input data originates from quantum processes, it is already in a quantum state and can be directly used as input for the model. For classical input data, the primary task is to encode it into quantum states. Encoding methods include amplitude encoding, qubit encoding, and hybrid encoding. Among these methods, qubit encoding maps each element to the rotation angle of a single-qubit gate, resulting in a shallower circuit and easier implementation on NISQ devices. The core of the hybrid quantum-classical classification model is the collaboration between parameterized quantum circuits and classical neural networks. First, parameterized quantum circuits act on the encoded quantum states to extract data features. Typically, the number of output qubits exceeds the number of qubits needed to predict labels. Therefore, the output state is measured to obtain expectation values, which are then forwarded to classical neural networks that produce the predicted labels.

Both the parameterized quantum circuit and the classical neural networks require a large number of parameters. As the initial parameters are random, the initial classification model is unlikely to be optimal. As in classification models implemented by classical neural networks, we introduce a classical optimizer that optimizes the parameterized quantum circuit and the neural networks, obtaining optimal parameters through iterative optimization. Notably, the parameterized quantum circuit and the classical neural networks are updated alternately during the optimization process. During each iteration, only the parameters that were not updated in the previous iteration are updated, while the other parameters remain unchanged. Similar to layer-wise learning used in quantum neural networks^19,20, alternating parameter updates help address the barren plateau problem during parameter optimization. Figure 1 shows the framework of the hybrid quantum-classical classification model based on BMERA.

The framework of the hybrid quantum-classical classification model based on BMERA. Input is classical data, whose features are extracted and mapped into quantum states (represented by circles). Then, quantum states are sent to BMERA. Each layer contains one or two sub-layers, where yellow and white rectangles represent unitary modules of the first and second sub-layers, respectively. Data dimension is reduced by partial trace operation, represented by hash marks. The expectation of the Pauli operator for the output is obtained through multiple measurement operations. Classical neural networks act on the expectation values and output predicted labels. In the optimization process, the parameters of BMERA and classical neural networks are iteratively optimized by a classical optimizer.
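The alternating update schedule described above can be sketched on a toy problem. The following is a minimal illustration only: the quadratic `cost` is a hypothetical stand-in for the model's cost function, with a scalar `theta` playing the role of the circuit parameters and a scalar `beta` the role of the network parameters, and plain gradient descent replaces the optimizer used in the paper.

```python
import numpy as np

def cost(theta, beta):
    """Toy convex stand-in for C(theta, beta); not the model's actual cost."""
    return (theta - 1.0) ** 2 + (beta + 2.0) ** 2 + 0.5 * theta * beta

def grad_theta(theta, beta):
    """Partial derivative of the toy cost with respect to theta."""
    return 2.0 * (theta - 1.0) + 0.5 * beta

def grad_beta(theta, beta):
    """Partial derivative of the toy cost with respect to beta."""
    return 2.0 * (beta + 2.0) + 0.5 * theta

theta, beta, lr = 0.0, 0.0, 0.1
history = [cost(theta, beta)]
for step in range(400):
    if step % 2 == 0:
        # even steps: update the "circuit" parameter, keep beta fixed
        theta -= lr * grad_theta(theta, beta)
    else:
        # odd steps: update the "network" parameter, keep theta fixed
        beta -= lr * grad_beta(theta, beta)
    history.append(cost(theta, beta))

print(theta, beta, history[-1])
```

For this toy cost the stationary point is at theta = 1.6 and beta = -2.4, and the alternating schedule approaches it while the cost never increases.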

Map data into quantum states

Let $D = \{(x_{1}, y_{1}), \ldots, (x_{m}, y_{m})\}$ denote the training data set, where $x_{i} = (x_{i}^{1}, x_{i}^{2}, \ldots, x_{i}^{n})$ is an n-dimensional data vector and $y_{i} \in \{0, \ldots, l - 1\}$ is the corresponding label. For classification tasks, the core task is to build a function $f : x_{i} \to y_{i}$ that maps each data vector $x_{i}$ to its label $y_{i}$. For qubit encoding, each element $x_{i}^{j}$ is first scaled to $[-1, 1]$ and then mapped into the state

$$|\phi(x_{i}^{j})\rangle = \cos\left(\frac{\pi}{2} x_{i}^{j}\right)|0\rangle + \sin\left(\frac{\pi}{2} x_{i}^{j}\right)|1\rangle.$$

$x_{i}$ corresponds to the tensor product state

$$|\phi(x_{i})\rangle = |\phi(x_{i}^{1})\rangle \otimes |\phi(x_{i}^{2})\rangle \otimes \cdots \otimes |\phi(x_{i}^{n})\rangle,$$

where $|\phi(x_{i})\rangle$ lies in a $2^{n}$-dimensional Hilbert space. Qubit encoding can be implemented by the fixed quantum circuit $P(x_{i}) = \bigotimes_{j=1}^{n} P_{i}^{j}$ acting on the initial state $|0\rangle^{\otimes n}$, where

$$P_{i}^{j} = e^{-i \frac{\pi}{2} x_{i}^{j} \sigma_{y}} = \begin{bmatrix} \cos\left(\frac{\pi}{2} x_{i}^{j}\right) & -\sin\left(\frac{\pi}{2} x_{i}^{j}\right) \\ \sin\left(\frac{\pi}{2} x_{i}^{j}\right) & \cos\left(\frac{\pi}{2} x_{i}^{j}\right) \end{bmatrix}.$$
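As a quick numerical check of the encoding, the sketch below builds the product state directly and also by applying the rotation matrices to the all-zero state, then confirms that both agree. It is a minimal NumPy illustration; the function names are hypothetical, and the input vector is assumed to be already scaled to $[-1, 1]$.

```python
import numpy as np
from functools import reduce

def qubit_state(xj):
    """Single-qubit state cos(pi/2 * x)|0> + sin(pi/2 * x)|1>."""
    return np.array([np.cos(np.pi / 2 * xj), np.sin(np.pi / 2 * xj)])

def encoding_gate(xj):
    """exp(-i * (pi/2) * x * sigma_y), written out as a real rotation matrix."""
    c, s = np.cos(np.pi / 2 * xj), np.sin(np.pi / 2 * xj)
    return np.array([[c, -s], [s, c]])

def encode(x):
    """Tensor-product state |phi(x)> of dimension 2**n."""
    return reduce(np.kron, [qubit_state(xj) for xj in x])

x = [-0.5, 0.0, 0.8]               # assumed already scaled to [-1, 1]
phi = encode(x)

# The same state obtained by applying P(x) to |0...0>
P = reduce(np.kron, [encoding_gate(xj) for xj in x])
zero = np.zeros(2 ** len(x))
zero[0] = 1.0
phi_from_circuit = P @ zero

print(np.allclose(phi, phi_from_circuit))
```

The state vector has $2^{n}$ entries, so this direct simulation is only practical for small n; on hardware the encoding costs one single-qubit rotation per feature.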

Parameterized quantum circuit based on BMERA

The parameterized quantum circuit, also called the ansatz, plays an essential role in the hybrid quantum-classical classification model. The challenge in implementing the model is to construct an ansatz that can represent the solution space of the problem being solved²¹. The entanglement entropy of BMERA can break the area law and support the volume law. In classification tasks with strong data correlation, BMERA can therefore represent the solution space effectively and better approximate the classification model.

Similar to ansatzes based on general tensor networks, BMERA consists of 2-qubit unitary modules arranged in a hierarchical layout. For an n-qubit input, BMERA needs $L = O(\log n)$ layers. In the ith layer, $2^{i}$ adjacent qubits are entangled to build local correlations. Let

$$\rho_{0} = |\phi(x_{i})\rangle\langle\phi(x_{i})| = \rho_{0}^{1} \otimes \rho_{0}^{2} \otimes \cdots \otimes \rho_{0}^{n},$$

represent n-qubit input, where $ρ_{0}^{i}$ represents the state of the ith qubit. In the first layer, the unitary $U_{1} (θ)$ acts on $ρ_{0}$ and produces the entangled state

$$\rho_{1} = |\phi_{1}(x_{i})\rangle\langle\phi_{1}(x_{i})| = \rho_{1}^{1} \otimes \rho_{1}^{2} \otimes \cdots \otimes \rho_{1}^{n/2},$$

where $\rho_{1}^{i}$ represents the entangled state of the ith adjacent qubit pair. In the tth layer, $U_{t}(\theta)$ acts on the output state $\rho_{t-1}$ of the $(t-1)$th layer and produces the entangled state

$$\rho_{t} = |\phi_{t}(x_{i})\rangle\langle\phi_{t}(x_{i})| = \rho_{t}^{1} \otimes \rho_{t}^{2} \otimes \cdots \otimes \rho_{t}^{n/2^{t}},$$

where $\rho_{t}^{i}$ represents the state entangled by $2^{t}$ adjacent qubits and $t \in \{1, \ldots, L\}$. As the number of circuit layers increases, BMERA gradually builds entanglement on a larger scale. In the last layer, the unitary operation $U_{L}(\theta)$ acts on the state $\rho_{L-1}$ and produces the final output state $\rho_{L}$, in which all qubits are entangled.

Figure 2 shows a BMERA with 16-qubit input, including four layers. Starting from the second layer, each layer contains two sub-layers to establish a larger range of entanglement. Each sub-layer includes multiple two-qubit unitary modules $V_{i}^{j} (θ)$ . Unitary modules $V_{i}^{j} (θ)$ are implemented by single-qubit and two-qubit unitary operations, where single-qubit unitary operations are rotation gates controlled by trainable parameters and two-qubit unitary operations are controlled gates acting on adjacent qubits.

The structure of BMERA with 16-qubit input. $\rho_{0}$ represents the input, comprising qubits $a_{0} \sim a_{15}$. The unitary operation in the first layer acts on $\rho_{0}$ and outputs the entangled state $\rho_{1}$, in which every 2 neighboring qubits are entangled. The unitary operation in the second layer forms the entangled state $\rho_{2}$, in which every 4 neighboring qubits are entangled. The unitary operation in the third layer forms the entangled state $\rho_{3}$, in which every 8 neighboring qubits are entangled. After four layers of unitary operations, all 16 qubits are entangled and a highly entangled state $\rho_{4}$ is built. Rectangles represent unitary modules $V_{i}^{j}(\theta_{i,j})$.

Classification amounts to mapping an input to its predicted label vector. Typically, the dimension of the label vector is smaller than that of the input, so the number of output qubits should be smaller than the number of input qubits. BMERA establishes entanglement over a larger scale, but its input and output have the same number of qubits, so BMERA alone cannot implement a contracting map from the input to the predicted label. The quantum convolutional neural network (QCNN) is an important quantum neural network model in which only some qubits of each layer are passed to the subsequent layer, with partial trace operations reducing the number of output qubits. Following this idea, the output of BMERA is obtained through multiple layers of unitary and partial trace operations, and its expectation value corresponds to the extracted feature.

The causal cone is an important property of MERA: it consists of the gates and connections that can affect a given set of output qubits. BMERA, an extension of MERA, also has the causal cone property, so its output is affected only by the unitary operations inside the cone. Inspired by the contracting structure of QCNN and the causal cone property of BMERA, we design a reduced BMERA that uses only some consecutive qubits as outputs and retains only the unitary modules located in the causal cone of those output qubits.

In the reduced BMERA, $\upsilon_{i-1}$ qubits of the $(i-1)$th layer are kept as the input of the ith layer, and the remaining qubits are discarded by a partial trace operation. The output of the ith layer can be written as $\rho'_{i} = \mathrm{tr}_{\tilde{o}_{i}}\left(U_{i}(\theta)\, \rho'_{i-1}\, U_{i}^{\dagger}(\theta)\right)$, where $\mathrm{tr}_{\tilde{o}_{i}}$ denotes the partial trace over all qubits other than the output qubits of the ith layer, and $\rho'_{i-1}$ and $\rho'_{i}$ represent the input and output of the ith layer, respectively.
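One layer of this map, a unitary followed by a partial trace, can be sketched with dense matrices. The following is an illustrative NumPy sketch rather than the circuit used in the paper: a random 4-qubit unitary stands in for the layer unitary, the kept qubits are chosen arbitrarily, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(dim):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(a)
    return q

def partial_trace(rho, keep, n):
    """Trace out every qubit not listed in `keep` from an n-qubit density matrix."""
    keep = sorted(keep)
    rho = rho.reshape([2] * (2 * n))      # n row axes followed by n column axes
    remaining = n                         # number of qubits still present
    for q in reversed(range(n)):          # go from the last qubit down so that
        if q not in keep:                 # lower axis positions stay valid
            rho = np.trace(rho, axis1=q, axis2=q + remaining)
            remaining -= 1
    d = 2 ** len(keep)
    return rho.reshape(d, d)

n = 4
psi = np.zeros(2 ** n, dtype=complex)
psi[0] = 1.0
rho_in = np.outer(psi, psi.conj())        # input state |0000><0000|

U = random_unitary(2 ** n)                # stand-in for the layer unitary
rho_out = partial_trace(U @ rho_in @ U.conj().T, keep=[1, 2], n=n)

print(rho_out.shape, np.trace(rho_out).real)
```

The output is a valid 2-qubit density matrix (Hermitian, unit trace), which would serve as the input of the next layer.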

Figure 3 shows a reduced BMERA with 16-qubit input. Structurally, the reduced BMERA is a binary tree. As the number of layers increases, the size of the subtrees gradually increases, but the number of output qubits decreases. In the final layer, the output qubits $a_{7}$, $a_{0}$, $a_{15}$, and $a_{8}$ are measured, and their expectation values are transmitted to classical neural networks. The reduced BMERA can effectively extract the input features while retaining the entanglement of the output qubits. Classical convolutional neural networks (CNNs) exhibit translation invariance because each convolutional layer shares its filter weights across positions, and they achieve higher accuracy when classifying spatially correlated data²². Inspired by the translation invariance of CNNs, the reduced BMERA adopts the same parameters for all unitary modules in the same sub-layer.

The structure of a reduced BMERA with 16-qubit input. The input $\rho'_{0}$ comprises qubits $a_{0} \sim a_{15}$. Each layer has multiple subtrees, and each dotted box represents a subtree. $\rho'_{0}$, $\rho'_{1}$, $\rho'_{2}$, and $\rho'_{3}$ represent the inputs of the first to fourth layers, respectively. Partial trace operations act on some qubits to reduce the number of output qubits. In each layer, quantum swap operations are adopted to establish entanglement on a larger scale. The final output is obtained by measuring the expectation values of the Pauli operator $\sigma_{z}$ for the qubits $a_{7}$, $a_{0}$, $a_{15}$, and $a_{8}$.

Cost function and optimization

The critical task of the hybrid quantum-classical classification model lies in formulating a cost function suited to the problem at hand. Common choices include mean squared error (MSE) and cross-entropy. Cross-entropy is better suited to classification tasks and is defined as

$$f_{c}(l^{x_{i}}, y_{i}) = -\frac{1}{m} \sum_{i} \left( y_{i} \log(l^{x_{i}}) + (1 - y_{i}) \log(1 - l^{x_{i}}) \right),$$

where $y_{i}$ and $l^{x_{i}}$ represent the correct and predicted labels, respectively, and m is the number of training samples. For the hybrid quantum-classical classification model, the initial step involves extracting data features using a parameterized quantum circuit.

Let $U(\theta)$ denote the parameterized quantum circuit acting on the input $|\phi(x_{i})\rangle$; the output is then

$$|\phi'(x_{i})\rangle = U(\theta) |\phi(x_{i})\rangle,$$

where $\theta$ represents the trainable parameters. Measuring the Pauli operator $\sigma_{z}$ on the jth qubit yields the expectation value

$$E^{j}(\theta) = \langle \phi(x_{i}) | U^{\dagger}(\theta)\, \sigma_{z}^{j}\, U(\theta) | \phi(x_{i}) \rangle,$$

where $σ_{z}^{j}$ means the operator acting on the jth qubit.
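To make the measured quantity concrete, the sketch below computes single-qubit $\sigma_{z}$ expectation values from a state vector, before and after an entangling gate. It is a minimal NumPy illustration: a single CNOT stands in for the circuit (an assumption for brevity, not the BMERA circuit), the helper name `z_expectations` is hypothetical, and the first qubit is taken as the most significant index.

```python
import numpy as np

def z_expectations(state, n):
    """Return [<sigma_z^1>, ..., <sigma_z^n>] for an n-qubit state vector."""
    probs = (np.abs(state) ** 2).reshape([2] * n)
    values = []
    for j in range(n):
        other_axes = tuple(k for k in range(n) if k != j)
        p0, p1 = probs.sum(axis=other_axes)   # marginal distribution of qubit j
        values.append(p0 - p1)
    return np.array(values)

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

q0 = np.array([np.cos(np.pi / 4), np.sin(np.pi / 4)])  # encodes x = 0.5
q1 = np.array([1.0, 0.0])                              # encodes x = 0
phi = np.kron(q0, q1)

before = z_expectations(phi, 2)          # approximately [0, 1]
after = z_expectations(CNOT @ phi, 2)    # approximately [0, 0]

# Operator form for the second qubit: <phi| U^dagger (I x Z) U |phi>
after_q1_operator = phi @ CNOT.T @ np.kron(I2, Z) @ CNOT @ phi

print(before, after, after_q1_operator)
```

The CNOT turns the product state into a Bell state, so the second qubit's expectation value drops from 1 to 0: the local measurement outcome carries information about the correlation built by the circuit.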

Subsequently, the expectation values of all output qubits are used to construct the feature vector $E(\theta) = \{E^{1}(\theta), E^{2}(\theta), \ldots, E^{n_{o}}(\theta)\}$, where $n_{o}$ denotes the number of output qubits. Finally, $E(\theta)$ is passed to the classical neural network $f_{nn}(E(\theta), \beta)$ to obtain the predicted label $l^{x_{i}}$, where $\beta$ denotes the trainable parameters of the classical neural network. Based on the cross-entropy method, the cost function of the hybrid quantum-classical classification model can be written as

$$C(\theta, \beta) = f_{c}(l^{x_{i}}, y_{i}) \circ f_{nn}(E(\theta), \beta) \circ f_{tn}(|\phi(x_{i})\rangle, \theta) = f_{c}\left(f_{nn}\left(f_{tn}(|\phi(x_{i})\rangle, \theta), \beta\right), y_{i}\right),$$

where $f_{tn} (| ϕ (x_{i}) ⟩, θ)$ represents the mapping from the input $| ϕ (x_{i}) ⟩$ to the expectation value $E (θ)$ , and $f_{c} (l^{x_{i}}, y_{i})$ means the cross-entropy function that maps the predicted label $l^{x_{i}}$ and the correct label $y_{i}$ to the cost function $C (θ, β)$ .

The subsequent step involves computing the optimal parameters $(\theta^{*}, \beta^{*})$ by minimizing the cost function $C(\theta, \beta)$. Gradient descent is a common optimization method in machine learning. Its core idea is to update the parameters iteratively in the direction of the negative gradient of the cost function. Autodifferentiation is an effective approach for computing the gradient of a composite cost function. In this approach, the cost function is decomposed into several subfunctions, and the gradient is computed by applying the chain rule, as in backpropagation, to the partial derivatives of these subfunctions. Consequently, the gradient descent method involves computing a series of partial derivatives of the subfunctions²³.

Let $\theta_{j}$ represent the jth parameter of $\theta$. By the chain rule, the partial derivative of the cost function $C(\theta, \beta)$ with respect to $E^{k}(\theta)$ is

$$g_{k} = \frac{\partial \left( f_{c}(l^{x_{i}}, y_{i}) \circ f_{nn}(E(\theta), \beta) \right)}{\partial E^{k}(\theta)} = \frac{\partial f_{c}(l^{x_{i}}, y_{i})}{\partial l^{x_{i}}} \cdot \frac{\partial f_{nn}(E(\theta), \beta)}{\partial E^{k}(\theta)},$$

where $\frac{\partial f_{c}(l^{x_{i}}, y_{i})}{\partial l^{x_{i}}} = \frac{1 - y_{i}}{1 - l^{x_{i}}} - \frac{y_{i}}{l^{x_{i}}}$ and $\frac{\partial f_{nn}(E(\theta), \beta)}{\partial E^{k}(\theta)}$ are computed by classical computers. Let $g = \{g_{1}, g_{2}, \ldots, g_{n_{o}}\}$ represent the vector of partial derivatives $g_{k}$. Then, the gradient of $C(\theta, \beta)$ with respect to the jth parameter $\theta_{j}$ is

$$\frac{\partial \left( f_{c}(l^{x_{i}}, y_{i}) \circ f_{nn}(E(\theta), \beta) \circ f_{tn}(|\phi(x_{i})\rangle, \theta) \right)}{\partial \theta_{j}} = \frac{\partial (g \cdot E(\theta))}{\partial \theta_{j}} = \sum_{k=1}^{n_{o}} g_{k} \frac{\partial E^{k}(\theta)}{\partial \theta_{j}}.$$
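The chain rule above can be checked numerically on a small example. The sketch below is illustrative only: `features` is a hypothetical classical stand-in for the circuit's expectation values, the network is a single logistic unit, and the cost is the per-sample cross-entropy. The analytic gradient is compared with central finite differences.

```python
import numpy as np

def features(theta):
    """Toy stand-in for f_tn: two 'expectation values' in [-1, 1]."""
    return np.array([np.cos(theta[0]) * np.cos(theta[1]), np.cos(theta[1])])

def features_jacobian(theta):
    """J[k, j] = dE^k / dtheta_j for the toy features above."""
    c0, c1 = np.cos(theta)
    s0, s1 = np.sin(theta)
    return np.array([[-s0 * c1, -c0 * s1],
                     [0.0, -s1]])

def predict(E, beta):
    """f_nn: one logistic unit; beta holds the weights followed by a bias."""
    return 1.0 / (1.0 + np.exp(-(beta[:-1] @ E + beta[-1])))

def cost(theta, beta, y):
    """Per-sample cross-entropy of the composed model."""
    l = predict(features(theta), beta)
    return -(y * np.log(l) + (1 - y) * np.log(1 - l))

def grad_theta(theta, beta, y):
    E = features(theta)
    l = predict(E, beta)
    dfc_dl = (1 - y) / (1 - l) - y / l        # derivative of the cross-entropy
    dl_dE = l * (1 - l) * beta[:-1]           # derivative of the logistic unit
    g = dfc_dl * dl_dE                        # the vector g_k
    return g @ features_jacobian(theta)       # sum_k g_k * dE^k/dtheta_j

theta = np.array([0.3, 1.1])
beta = np.array([0.8, -0.5, 0.1])
y = 1.0

analytic = grad_theta(theta, beta, y)
eps = 1e-6
numeric = np.array([
    (cost(theta + eps * e, beta, y) - cost(theta - eps * e, beta, y)) / (2 * eps)
    for e in np.eye(2)
])
print(analytic, numeric)
```

In the hybrid model the Jacobian of the expectation values is not available in closed form; it is the part that must be estimated on the quantum device.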

A key step in minimizing the cost function is computing the partial derivative of the expectation value $E^{k}(\theta)$ with respect to $\theta_{j}$, which can be done using the parameter shift rule²⁴

$$\frac{\partial E^{k}(\theta)}{\partial \theta_{j}} = E^{k}\left(\theta_{j} + \frac{\pi}{2} \Delta_{j}, \tilde{\theta}_{j}\right) - E^{k}\left(\theta_{j} - \frac{\pi}{2} \Delta_{j}, \tilde{\theta}_{j}\right),$$

where $Δ_{j}$ is a small increment of $θ_{j}$ in the positive direction, and ${\tilde{θ}}_{j}$ denotes all parameters of $θ$ except for $θ_{j}$ . $θ_{j} + \frac{π}{2} Δ_{j}$ and $θ_{j} - \frac{π}{2} Δ_{j}$ are shift parameters for evaluating the gradient. The parameter shift rule enables precise gradient computation without discretization errors, and its circuit is easily implementable on near-term quantum devices.
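The exactness of the parameter shift rule can be illustrated on a single qubit. The sketch below assumes the common gate convention $R_{Y}(\theta) = e^{-i \theta \sigma_{y} / 2}$, for which the rule uses a shift of $\pi/2$ and a prefactor of 1/2; the prefactor and shift depend on how the gate generator is normalized, so this normalization is an assumption and may differ from the convention of the equation above. The function names are hypothetical.

```python
import numpy as np

def ry(theta):
    """R_Y(theta) = exp(-i * theta * sigma_y / 2), a real rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

Z = np.diag([1.0, -1.0])

def expectation(theta):
    """<0| R_Y(theta)^T Z R_Y(theta) |0>, which equals cos(theta)."""
    psi = ry(theta) @ np.array([1.0, 0.0])
    return psi @ Z @ psi

def parameter_shift(f, theta, shift=np.pi / 2):
    """Gradient from two shifted evaluations (convention assumed above)."""
    return 0.5 * (f(theta + shift) - f(theta - shift))

theta = 0.7
grad_shift = parameter_shift(expectation, theta)
grad_exact = -np.sin(theta)

# Central finite difference for comparison; it carries a truncation error
h = 1e-3
grad_fd = (expectation(theta + h) - expectation(theta - h)) / (2 * h)

print(grad_shift, grad_exact, grad_fd)
```

The shifted evaluations reproduce the exact derivative up to floating-point rounding, while the finite-difference estimate depends on the step size h and, on hardware, would also amplify shot noise.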

Expressibility

Expressibility is a crucial metric for assessing parameterized quantum circuits. It indicates the capability of states generated by parameterized quantum circuits to span the entire Hilbert space. States generated by Haar random unitaries uniformly cover the Hilbert space, thus exhibiting the highest expressibility. The smaller the distance between the state distribution resulting from uniform sampling of a parameterized quantum circuit and the Haar random unitary distribution, the more expressive the circuit becomes. Since the Haar random unitary follows a uniform distribution, greater uniformity in the states generated by randomly sampling parameterized quantum circuits results in higher expressibility. Thus, expressibility can be quantified as the deviation of the probability distribution of states generated by a parameterized quantum circuit from that of the Haar random unitary.

Let $Q(\alpha)$ denote a parameterized quantum circuit with n-qubit input, and let $|\psi(\alpha_{1})\rangle$ and $|\psi(\alpha_{2})\rangle$ be two states generated by randomly sampling the parameters of $Q(\alpha)$. $F_{b} = |\langle \psi(\alpha_{1}) | \psi(\alpha_{2}) \rangle|^{2t}$ is the t-moment fidelity between $|\psi(\alpha_{1})\rangle$ and $|\psi(\alpha_{2})\rangle$. When the 1-moment fidelity is used to describe the expressibility of $Q(\alpha)$, $F_{b}$ reduces to $|\langle \psi(\alpha_{1}) | \psi(\alpha_{2}) \rangle|^{2}$, referred to simply as the fidelity. Randomly sampling pairs of states yields the fidelity distribution function $P_{b}(F_{b}; \alpha)$. Similarly, $F_{h}$ represents the fidelity of states produced by Haar-random unitaries with n-qubit input, and its distribution function is $P_{h}(F_{h}) = (N - 1)(1 - F_{h})^{N - 2}$, where $N = 2^{n}$ is the dimension of the Hilbert space.

Let $\mathrm{KL}(P_{b}(F_{b}; \alpha) \,\|\, P_{h}(F_{h}))$ denote the KL divergence between the fidelity distribution functions $P_{b}(F_{b}; \alpha)$ and $P_{h}(F_{h})$. The smaller the KL divergence, the closer the fidelity distribution of $Q(\alpha)$ is to that of the Haar-random unitary. As the Haar-random unitary has the highest expressibility, the closer the distribution of $Q(\alpha)$ is to the Haar-random distribution, the stronger the representational power of $Q(\alpha)$²⁵. When $\mathrm{KL}(P_{b}(F_{b}; \alpha) \,\|\, P_{h}(F_{h})) = 0$, $P_{b}(F_{b}; \alpha)$ is equal to $P_{h}(F_{h})$, and $Q(\alpha)$ has the highest expressibility. We define the expressibility of $Q(\alpha)$ as

$$R(Q(\alpha)) = -\log_{10}\left(\mathrm{KL}(P_{b}(F_{b}; \alpha) \,\|\, P_{h}(F_{h}))\right).$$

The larger $R(Q(\alpha))$, the stronger the expressibility of $Q(\alpha)$. In the hybrid quantum-classical classification model, expressibility corresponds to the capacity to address the target problem effectively. Greater expressibility results in the output state of the parameterized quantum circuit being closer to the correct solution.
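The definition can be estimated numerically by histogramming sampled fidelities. The sketch below is a minimal NumPy illustration under stated assumptions: the toy circuit (one layer of $R_{Y}$ rotations on two qubits, with no entangling gate) is hypothetical and is not one of the layouts in Fig. 4, the helper names are illustrative, and the bin count and sample size are arbitrary. Haar bin probabilities are obtained by integrating $P_{h}(F_{h})$ over each bin, and directly sampled Haar-random states serve as a reference.

```python
import numpy as np

rng = np.random.default_rng(0)

def haar_state(dim):
    """Haar-random pure state: a normalized complex Gaussian vector."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

def ry_product_state(n):
    """Toy circuit: R_Y(alpha_k)|0> on each of n qubits, no entanglement."""
    state = np.array([1.0])
    for a in rng.uniform(0, 2 * np.pi, size=n):
        state = np.kron(state, [np.cos(a / 2), np.sin(a / 2)])
    return state

def sample_fidelities(sampler, pairs):
    """Fidelity |<psi1|psi2>|^2 for independently sampled state pairs."""
    fids = [abs(np.vdot(sampler(), sampler())) ** 2 for _ in range(pairs)]
    return np.clip(fids, 0.0, 1.0)

def kl_to_haar(fids, dim, bins=50):
    """Histogram estimate of KL(P_b || P_h) over equal-width fidelity bins."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    counts, _ = np.histogram(fids, bins=edges)
    p_b = counts / counts.sum()
    # integral of (N-1)(1-F)^(N-2) over each bin
    p_h = (1 - edges[:-1]) ** (dim - 1) - (1 - edges[1:]) ** (dim - 1)
    mask = p_b > 0
    return np.sum(p_b[mask] * np.log(p_b[mask] / p_h[mask]))

n_qubits, dim, pairs = 2, 4, 5000
kl_haar = kl_to_haar(sample_fidelities(lambda: haar_state(dim), pairs), dim)
kl_toy = kl_to_haar(sample_fidelities(lambda: ry_product_state(n_qubits), pairs), dim)

expr_haar = -np.log10(kl_haar)
expr_toy = -np.log10(kl_toy)
print(expr_haar, expr_toy)
```

The unentangled toy circuit yields a larger KL divergence, and hence a lower expressibility score, than the Haar reference. With a finite number of samples the KL estimate of the Haar reference does not reach zero, so the attainable score is bounded by the sample size and binning.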

The layout of parameterized quantum circuits can be changed by varying qubit connections and gates. The expressibility of parameterized quantum circuits with different layouts is analyzed as follows. Figure 4 shows 8 circuit layouts of the 2-qubit unitary module $U_{i}^{j}(\theta_{i,j})$, most of which are built on previous studies²⁶. Circuit (1) consists of the one-qubit rotation gates $R_{X}$ and $R_{Z}$ and the two-qubit controlled gate CNOT. Each CNOT gate acts on one neighboring qubit pair to construct entanglement. Circuits (2) and (3) have a layout similar to circuit (1), except that the CNOT gate is replaced with controlled-$R_{Z}$ and controlled-$R_{X}$ gates, respectively. Circuit (4) is implemented by H, CNOT, and $R_{X}$ gates and has the lowest expressibility. Circuit (5), first presented in Ref. ²⁷, comprises $R_{Y}$ and CNOT gates. Circuits (6) and (7) adopt a layout similar to circuit (5), except that the CNOT gate is replaced with controlled-$R_{Z}$ and controlled-$R_{X}$ gates, respectively. Circuit (8), which implements an arbitrary SU(4) operation²⁰, has the highest expressibility.

The circuit layouts of the unitary module $U_{i}^{j}(\theta_{i,j})$. Each panel represents a circuit layout. $R_{X}$, $R_{Y}$, and $R_{Z}$ denote rotation gates around the X-axis, Y-axis, and Z-axis, respectively. $M(\kappa, \mu, \nu) = R_{Z}(\kappa) R_{X}(-\pi/2) R_{Z}(\mu) R_{X}(\pi/2) R_{Z}(\nu)$ represents an arbitrary single-qubit gate used in the arbitrary SU(4) circuit.

Table 1 shows the expressibilities of the unitary module $U_{i}^{j}(\theta_{i,j})$ and of the BMERA with 8-qubit input built from $U_{i}^{j}(\theta_{i,j})$. Each column corresponds to one circuit layout in Fig. 4. The first row shows the expressibility of the unitary module $U_{i}^{j}(\theta_{i,j})$, and the second row shows the expressibility of the BMERA built from it. Each fidelity distribution is obtained from 10000 samples. The higher the expressibility of $U_{i}^{j}(\theta_{i,j})$, the higher the expressibility of the corresponding BMERA. Circuit (3) has higher expressibility than circuit (2), and circuit (7) has higher expressibility than circuit (6). This is consistent with the fact that controlled-$R_{X}$ has higher expressibility than controlled-$R_{Z}$²⁶. Circuit (8) has the highest expressibility among all circuit layouts. Figure 5 shows the histograms of the fidelity distributions of circuit (8) and of the BMERA built from it. The Hilbert space dimension of the BMERA with 8-qubit input is 256. With such a large dimension, both $P_{h}(F_{h})$ and $P_{b}(F_{b}; \alpha)$ are concentrated at fidelities near 0, so the X-axis of Fig. 5 only shows the range [0, 0.1]. Simulation results show that when the parameterized quantum circuit has high expressibility, its fidelity distribution is close to that of the Haar-random unitary.

Table 1.

The expressibility of the unitary module $U_{i}^{j}(\theta_{i,j})$ and BMERA.

Circuit	1	2	3	4	5	6	7	8
$V_{i}^{j} (θ)$	0.943	1.036	1.149	0.409	0.609	1.432	1.587	1.854
BMERA	1.159	1.252	1.638	0.179	0.247	1.721	1.854	2.301

The histograms of the fidelity distributions. Panels (a–b) show the histograms of the fidelity distributions of the circuit (8) and the corresponding BMERA. The orange line represents the fidelity distribution of the Haar-random unitary. The fidelity distributions of the circuit (8) and BMERA are close to the fidelity distributions of the corresponding Haar-random unitary, respectively.

Numerical simulations and discussions

In this section, we evaluate the performance of the hybrid quantum-classical classification model using the TensorFlow Quantum (TFQ) framework. First, we demonstrate the accuracy of the proposed classification model in classical classification tasks. Next, we verify the accuracy of these classification tasks in noisy environments. Lastly, we demonstrate the accuracy of the proposed classification model in cluster state excitation discrimination tasks.

Classical data classification

The MNIST dataset is widely used in machine learning and consists of 60000 training samples and 10000 test samples. Each sample is a $28 \times 28$-pixel grayscale image of a handwritten digit from 0 to 9. Our simulations primarily focus on binary classification tasks, where we select two categories of handwritten digits for the training set. Because NISQ devices offer only a limited number of qubits and shallow circuits, the samples are reduced to 8-dimensional vectors using principal component analysis (PCA).
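The preprocessing step can be sketched as PCA followed by per-feature rescaling to $[-1, 1]$, the range expected by qubit encoding. The following is an illustrative NumPy sketch: synthetic random data stands in for the flattened MNIST images, the helper names are hypothetical, and PCA is computed through a singular value decomposition rather than through any particular library routine used by the authors.

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    # rows of Vt are principal directions, ordered by decreasing singular value
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def scale_features(Z):
    """Rescale every column linearly to the interval [-1, 1]."""
    lo, hi = Z.min(axis=0), Z.max(axis=0)
    return 2.0 * (Z - lo) / (hi - lo) - 1.0

rng = np.random.default_rng(0)
X = rng.random((200, 784))        # synthetic stand-in for 28x28 images, not MNIST
Z = scale_features(pca_reduce(X, 8))

print(Z.shape, Z.min(), Z.max())
```

Each row of `Z` is then an 8-dimensional vector that can be encoded on 8 qubits, one rotation angle per feature.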

First, we analyzed the accuracy of the binary classification tasks. Besides the 8 circuit layouts shown in Fig. 4, we introduced an alternate circuit layout, in which the unitary modules of the first sub-layer are implemented by circuit (4) and the unitary modules of the second sub-layer are implemented by circuit (5). We adopt the adaptive moment estimation (Adam) method²⁸ to train the hybrid quantum-classical classification model, with a training data size of 32 and a learning rate of 0.01. Table 2 shows the accuracies and standard deviations of various binary classification tasks. Figure 6 shows the accuracies and standard deviations as a bar chart.

Table 2.

The accuracies and standard deviations (%).

Circuit	Parameter	5 and 6	0 and 7	3 and 6	4 and 5	7 and 8	4 and 9
1	68	95.42 $\pm$ 0.55	99.06 $\pm$ 0.10	95.92 $\pm$ 0.29	94.35 $\pm$ 0.56	94.16 $\pm$ 0.54	83.39 $\pm$ 0.73
2	85	96.69 $\pm$ 0.27	99.03 $\pm$ 0.75	97.66 $\pm$ 0.20	98.95 $\pm$ 0.24	97.21 $\pm$ 0.32	87.08 $\pm$ 0.37
3	85	96.73 $\pm$ 0.12	98.93 $\pm$ 0.13	98.91 $\pm$ 0.17	98.15 $\pm$ 0.16	97.06 $\pm$ 0.32	87.39 $\pm$ 0.57
4	34	86.64 $\pm$ 0.57	95.49 $\pm$ 0.63	82.28 $\pm$ 0.42	83.00 $\pm$ 0.43	83.68 $\pm$ 1.79	70.84 $\pm$ 0.95
5	34	95.23 $\pm$ 0.05	99.15 $\pm$ 0.05	98.63 $\pm$ 0.16	97.32 $\pm$ 0.61	96.88 $\pm$ 0.27	86.90 $\pm$ 0.15
6	102	96.78 $\pm$ 0.32	99.21 $\pm$ 0.01	98.70 $\pm$ 0.13	98.25 $\pm$ 0.04	97.29 $\pm$ 0.24	87.07 $\pm$ 1.26
7	102	96.76 $\pm$ 0.33	99.27 $\pm$ 0.57	98.94 $\pm$ 0.01	98.04 $\pm$ 0.12	97.32 $\pm$ 0.24	87.71 $\pm$ 0.63
8	255	97.17 $\pm$ 0.41	99.25 $\pm$ 0.34	98.78 $\pm$ 0.06	98.52 $\pm$ 0.15	97.49 $\pm$ 0.13	87.39 $\pm$ 0.94
Alter	34	95.53 $\pm$ 0.23	96.90 $\pm$ 0.29	97.28 $\pm$ 0.12	95.45 $\pm$ 0.53	95.16 $\pm$ 0.58	81.83 $\pm$ 0.71

The first column shows the circuit layouts adopted by BMERA, and the second column indicates the number of trainable parameters. The remaining columns represent mean accuracies and standard deviations for six binary classification tasks. The last row denotes mean accuracies and standard deviations of the classification model based on the alternate circuit layout. All accuracies and standard deviations are obtained by five random simulations.

The accuracies and standard deviations (%) for classification models based on different circuit layouts in bar chart.

The accuracy of the classifier based on the alternate circuit layout generally falls between that of the classifiers based on circuits (4) and (5). In certain cases, its accuracy approaches or exceeds the higher of the two. Digits 4 and 9 exhibit similar local features, and many detailed features are lost during the dimensionality reduction process, which yields lower accuracy for this classification task. Circuit (4) has the poorest expressibility, so the classification model based on it has the lowest accuracy among all models. Except for the classification model based on circuit (4) and the task of classifying digits 4 and 9, all accuracies are no less than 94%.

Upon comparing Tables 1 and 2, we observe a correlation between the accuracy and the expressibility of BMERA. In general, higher expressibility correlates with higher accuracy²⁶. In most binary classification tasks, there is little variation in accuracy among classification models based on circuits (6), (7), and (8). Despite circuit (8) having the highest expressibility among the binary classification models, its accuracy is lower than that of models based on circuits (6) and (7) in certain classification tasks. This discrepancy is primarily attributed to circuit (8) requiring excessive parameters, leading to overfitting. Generally, BMERA models with higher expressibility require more parameters and complex circuit connections. Circuit layouts should be chosen based on circuit scale and accuracy requirements in practical tasks. If there is minimal difference in accuracy among different circuit layouts, we select the BMERA model with fewer parameters, as it is easier to implement on NISQ devices.
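The selection rule described above can be sketched as a small function: among layouts whose accuracy is within a tolerance of the best one, pick the layout with the fewest parameters. The rows below are the "5 and 6" column of Table 2; the function name `select_layout` and the tolerance values are assumptions made for illustration, since the text does not quantify "minimal difference".

```python
# (circuit label, trainable parameters, mean accuracy in %) for the
# "5 and 6" task, copied from Table 2.
TABLE2_5_AND_6 = [
    ("1", 68, 95.42), ("2", 85, 96.69), ("3", 85, 96.73),
    ("4", 34, 86.64), ("5", 34, 95.23), ("6", 102, 96.78),
    ("7", 102, 96.76), ("8", 255, 97.17), ("Alter", 34, 95.53),
]

def select_layout(rows, tolerance=1.0):
    """Return the row with the fewest parameters among the layouts whose
    accuracy is within `tolerance` percentage points of the best accuracy.
    Ties in parameter count are broken in favour of higher accuracy."""
    best = max(acc for _, _, acc in rows)
    candidates = [row for row in rows if best - row[2] <= tolerance]
    return min(candidates, key=lambda row: (row[1], -row[2]))

# A hypothetical tolerance of one percentage point.
print(select_layout(TABLE2_5_AND_6, tolerance=1.0))
```

A looser tolerance favours the smaller layouts, which is the trade-off between circuit scale and accuracy that the paragraph describes.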

In the hybrid quantum-classical classification model, the number of parameters in BMERA grows polynomially with the number of qubits. To reduce the number of parameters, the unitary modules in the same sub-layer share the same parameters, drawing inspiration from the identical filter parameters within a layer of a CNN, which provide translation invariance. Table 3 displays the accuracies and standard deviations for classification models incorporating translation invariance, and Figure 7 shows them as a bar chart. Comparing Tables 2 and 3, classification models without translation invariance generally exhibit higher accuracy but require more parameters than those with translation invariance. For circuits (6), (7), and (8), the accuracy differs only slightly between models with and without translation invariance.
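The parameter sharing described above can be sketched as follows: one parameter vector is stored per sub-layer and tiled across every unitary module in that sub-layer. The layout `[4, 3, 2, 1]` and the two parameters per module are hypothetical values chosen for illustration and are not taken from the BMERA circuits in this paper.

```python
import numpy as np

def expand_shared_parameters(shared, modules_per_sublayer):
    """Tile one parameter vector per sub-layer across all modules in it.

    shared: array of shape (n_sublayers, params_per_module)
    modules_per_sublayer: number of unitary modules in each sub-layer
    Returns one array of shape (modules, params_per_module) per sub-layer.
    """
    return [np.tile(row, (n, 1)) for row, n in zip(shared, modules_per_sublayer)]

def count_parameters(modules_per_sublayer, params_per_module, shared):
    """Trainable parameter count with or without sharing inside a sub-layer."""
    if shared:
        return len(modules_per_sublayer) * params_per_module
    return sum(modules_per_sublayer) * params_per_module

layout = [4, 3, 2, 1]   # hypothetical module counts per sub-layer
p = 2                   # hypothetical parameters per unitary module
shared = np.arange(len(layout) * p, dtype=float).reshape(len(layout), p)
per_module = expand_shared_parameters(shared, layout)
print(count_parameters(layout, p, shared=False), count_parameters(layout, p, shared=True))
```

With sharing, the count depends only on the number of sub-layers, which is why the translation-invariant models need fewer parameters for the same circuit layout.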

Table 3.

The accuracies and standard deviations (%) for classification models based on different circuit layouts (translation invariance).

Circuit	Parameter	5 and 6	0 and 7	3 and 6	4 and 5	7 and 8	4 and 9
1	20	91.90 $\pm$ 0.31	98.08 $\pm$ 0.66	92.50 $\pm$ 0.50	92.08 $\pm$ 0.61	91.33 $\pm$ 0.77	77.69 $\pm$ 1.36
2	37	96.21 $\pm$ 0.41	96.49 $\pm$ 0.57	98.35 $\pm$ 0.42	97.31 $\pm$ 0.60	96.38 $\pm$ 0.45	85.53 $\pm$ 0.61
3	37	96.64 $\pm$ 0.23	99.10 $\pm$ 0.07	98.35 $\pm$ 0.12	96.99 $\pm$ 0.16	96.87 $\pm$ 0.07	87.81 $\pm$ 0.42
4	10	81.40 $\pm$ 1.18	94.55 $\pm$ 0.46	82.57 $\pm$ 0.11	79.61 $\pm$ 0.54	82.51 $\pm$ 2.10	74.06 $\pm$ 0.48
5	10	93.52 $\pm$ 0.31	98.72 $\pm$ 0.34	90.94 $\pm$ 0.45	93.52 $\pm$ 0.12	94.46 $\pm$ 0.42	80.89 $\pm$ 0.29
6	30	96.76 $\pm$ 0.22	99.24 $\pm$ 0.08	98.92 $\pm$ 0.08	97.76 $\pm$ 0.36	96.58 $\pm$ 0.00	87.75 $\pm$ 0.49
7	30	96.62 $\pm$ 0.38	99.08 $\pm$ 0.12	98.75 $\pm$ 0.14	97.27 $\pm$ 0.13	97.06 $\pm$ 0.33	88.19 $\pm$ 0.87
8	75	96.96 $\pm$ 0.76	99.10 $\pm$ 0.25	98.79 $\pm$ 0.22	97.70 $\pm$ 0.43	97.33 $\pm$ 0.27	88.41 $\pm$ 0.11
Alter	14	94.00 $\pm$ 0.39	96.98 $\pm$ 1.00	94.83 $\pm$ 0.47	92.91 $\pm$ 0.35	94.66 $\pm$ 0.64	82.33 $\pm$ 0.59


Figure 7.

The accuracies and standard deviations (%) for classification models based on different circuit layouts (translation invariance).

Table 4 compares the accuracies and standard deviations of the hybrid quantum-classical classification model based on BMERA (abbreviated as HBMERA) with those of quantum classification models based on TTN¹³, MERA¹³, and BMERA. Figure 8 shows the same data as a bar chart.

Table 4.

The accuracies and standard deviations (%) of the TTN, MERA, BMERA, and HBMERA classification models.

Classification	5 and 6	0 and 7	3 and 6	4 and 5	7 and 8	4 and 9
TTN	79.79 $\pm$ 0.35	92.53 $\pm$ 0.18	82.12 $\pm$ 0.32	81.69 $\pm$ 0.47	76.78 $\pm$ 0.39	78.58 $\pm$ 0.16
MERA	94.72 $\pm$ 0.18	98.57 $\pm$ 0.11	95.20 $\pm$ 0.19	94.32 $\pm$ 0.20	94.12 $\pm$ 0.09	85.65 $\pm$ 0.57
BMERA	95.23 $\pm$ 0.05	98.71 $\pm$ 0.00	98.52 $\pm$ 0.66	96.09 $\pm$ 0.13	96.10 $\pm$ 0.07	85.70 $\pm$ 0.42
HBMERA	95.83 $\pm$ 0.30	99.15 $\pm$ 0.05	98.63 $\pm$ 0.16	97.32 $\pm$ 0.61	96.88 $\pm$ 0.27	86.92 $\pm$ 0.15


Figure 8.

The accuracies and standard deviations (%) of the TTN, MERA, BMERA, and HBMERA classification models.

Figure 9 shows the accuracies of the above four classification models based on circuit (5). The HBMERA classification model exhibits the highest accuracy across all classification tasks. Moreover, the accuracy of the BMERA classification model is lower than that of the HBMERA classification model but higher than those of the TTN and MERA classification models.

Figure 9.

The accuracies of classification models based on TNs. Panels (a–f) show the accuracies for 6 classification tasks. The blue, orange, green, and red lines represent the accuracies of the TTN, MERA, BMERA, and HBMERA classification models, respectively.

At present, several classification models have been implemented using tensor networks, such as the Unitary Tree Tensor Network (UTTN)²⁹, Residual Matrix Product State (RMPS)³⁰, and Projected Entangled Pair States (PEPS)³¹. The core idea of these models is to view tensor networks as a special type of network, map data to tensor states, and achieve recognition and classification through tensor contraction. Essentially, these models are still implemented using classical computational methods. To further evaluate the performance of the BMERA and HBMERA classifiers, we conducted classification experiments on the MNIST and Fashion-MNIST datasets. Due to the limited scale of currently available quantum circuits, we only implemented binary classification tasks. For the MNIST dataset, we selected classes 5 and 6, and for the Fashion-MNIST dataset, we selected classes 1 and 2. Tables 5 and 6 present the accuracies of various classification models applied to the MNIST and Fashion-MNIST datasets. These models include a 1-layer Neural Network (1-layer NN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Fully Connected Network (FCN), UTTN, RMPS, and PEPS, along with the proposed BMERA and HBMERA models.

Table 5.

Accuracy of BMERA, HBMERA and classical models on MNIST dataset.

Models	BMERA	HBMERA	1-layer NN	CNN	RNN	FCN	UTTN	RMPS	PEPS
Accuracy	97.13	97.35	97.70	99.73	100	99.05	98.16	99.09	99.22


Table 6.

Accuracy of BMERA, HBMERA and classical models on Fashion-MNIST dataset.

Models	BMERA	HBMERA	1-layer NN	CNN	RNN	FCN	UTTN	RMPS	PEPS
Accuracy	97.90	98.45	98.67	99.70	99.00	99.32	99.10	98.97	99.85


From the simulation results, we observe that the accuracies of the BMERA and HBMERA models are lower than those of commonly used neural networks and TN models. This is primarily due to the scale limitations of existing quantum circuits, which make it challenging to process high-dimensional data. Dimensionality reduction is required before classical data can be encoded into quantum states, which impacts the overall accuracy of the algorithm.

Classical data classification under noise environment

In this subsection, we simulate depolarization noise acting on an 8-qubit quantum system and compare classification accuracies in noisy and noiseless environments. Under depolarization noise, the state $ρ$ is replaced by the completely mixed state $I / 2$ with probability s and left unchanged with probability $1 - s$ , giving $\tilde{ρ} = s I / 2 + (1 - s) ρ$ . We restrict s to the small interval [0, 0.05] so that the state $ρ$ undergoes minimal alteration. Tables 7 and 8 show the accuracies and standard deviations of the HBMERA classification model based on circuit (5), with depolarization noise acting on the initial state and on the whole quantum system, respectively. Figures 10 and 11 show the corresponding accuracies and standard deviations as bar charts. Figure 12 shows the accuracies of a binary classification task in noisy and noiseless environments.
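The depolarization map above can be written directly on a density matrix. The sketch below treats a single qubit, which is an assumption made to keep the example small; how the noise is distributed over the 8-qubit system in the simulations is not specified here. The helper `bloch_vector` is introduced only to make the effect visible.

```python
import numpy as np

def depolarize(rho, s):
    """Single-qubit depolarization: rho -> s * I/2 + (1 - s) * rho."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must be a probability in [0, 1]")
    return s * np.eye(2) / 2 + (1 - s) * rho

def bloch_vector(rho):
    """Expectation values of the Pauli X, Y, Z operators for a qubit state."""
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Y = np.array([[0, -1j], [1j, 0]])
    Z = np.array([[1, 0], [0, -1]], dtype=complex)
    return np.real([np.trace(rho @ P) for P in (X, Y, Z)])

# The pure state |+><+| has Bloch vector (1, 0, 0).
plus = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)
noisy = depolarize(plus, s=0.05)
print(bloch_vector(noisy))
```

The map leaves the trace unchanged and shrinks the Bloch vector by the factor $1 - s$ , so for s in [0, 0.05] the state moves only slightly toward the maximally mixed state.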

Table 7.

The accuracies and standard deviations (%) of the HBMERA classification model with depolarization noise acting on input states.

Noise	5 and 6	0 and 7	3 and 6	4 and 5	7 and 8	4 and 9
s=0	95.83 $\pm$ 0.30	99.15 $\pm$ 0.05	98.63 $\pm$ 0.16	97.32 $\pm$ 0.61	96.88 $\pm$ 0.27	86.92 $\pm$ 0.15
s=0.01	97.15 $\pm$ 0.19	99.21 $\pm$ 0.05	98.48 $\pm$ 0.10	96.89 $\pm$ 0.41	96.15 $\pm$ 0.14	84.54 $\pm$ 0.43
s=0.02	96.95 $\pm$ 0.39	98.53 $\pm$ 0.02	98.90 $\pm$ 0.02	97.40 $\pm$ 0.19	95.45 $\pm$ 0.42	86.12 $\pm$ 0.56
s=0.03	97.00 $\pm$ 0.56	99.06 $\pm$ 0.16	98.02 $\pm$ 0.37	96.98 $\pm$ 0.38	96.94 $\pm$ 0.45	85.55 $\pm$ 0.68
s=0.04	95.71 $\pm$ 0.14	99.02 $\pm$ 0.08	98.05 $\pm$ 0.48	97.14 $\pm$ 0.18	96.76 $\pm$ 0.25	85.53 $\pm$ 0.49
s=0.05	96.32 $\pm$ 0.27	98.69 $\pm$ 0.07	98.75 $\pm$ 0.08	96.77 $\pm$ 0.14	96.36 $\pm$ 0.33	82.54 $\pm$ 0.68


The first column describes the noise probability s, and the remaining columns represent the mean accuracies and standard deviations of 6 binary classification tasks of MNIST dataset.

Table 8.

The accuracies and standard deviations (%) of the HBMERA classification model with depolarization noise acting on the whole quantum system.

Noise	5 and 6	0 and 7	3 and 6	4 and 5	7 and 8	4 and 9
s=0	95.83 $\pm$ 0.30	99.15 $\pm$ 0.05	98.63 $\pm$ 0.16	97.32 $\pm$ 0.61	96.88 $\pm$ 0.27	86.92 $\pm$ 0.15
s=0.01	96.12 $\pm$ 0.11	98.51 $\pm$ 0.10	98.41 $\pm$ 0.12	96.52 $\pm$ 0.30	96.38 $\pm$ 0.31	86.40 $\pm$ 0.37
s=0.02	96.70 $\pm$ 0.06	98.94 $\pm$ 0.06	98.10 $\pm$ 0.46	97.30 $\pm$ 0.41	95.50 $\pm$ 0.28	85.54 $\pm$ 0.54
s=0.03	96.70 $\pm$ 0.41	96.54 $\pm$ 0.19	98.12 $\pm$ 0.13	96.61 $\pm$ 0.28	96.71 $\pm$ 0.29	84.31 $\pm$ 0.96
s=0.04	96.63 $\pm$ 0.29	98.91 $\pm$ 0.28	98.10 $\pm$ 0.03	97.22 $\pm$ 0.22	97.35 $\pm$ 0.08	85.32 $\pm$ 0.31
s=0.05	96.38 $\pm$ 0.19	98.28 $\pm$ 0.18	98.63 $\pm$ 0.13	97.35 $\pm$ 0.23	95.07 $\pm$ 0.42	86.60 $\pm$ 0.57


Figure 10.

The accuracies and standard deviations (%) of the HBMERA classification model with depolarization noise acting on input states.

Figure 12.

The accuracies of the HBMERA classification model under noisy and noiseless environments. Panels (a–b) show the accuracies of classifying digits 4 and 5 with depolarization noise acting on input states and the whole quantum system, respectively. Solid lines represent the accuracies under noiseless environments, and dotted lines represent the accuracies under depolarization noise environments.

Figure 11.

The accuracies and standard deviations (%) of the HBMERA classification model with depolarization noise acting on the whole quantum system.

Simulation results indicate that the accuracy discrepancy for the same binary classification task between noisy and noiseless environments ranges from 0 to 0.05. The maximum discrepancy, 0.0338, is observed in the task of distinguishing digits 4 and 9, which also exhibits the lowest accuracy in a noiseless environment. Interestingly, in certain instances, the mean accuracies in noisy environments surpass those in noiseless environments. When s does not exceed 0.05, classification accuracy changes minimally compared to the noiseless environment, regardless of whether the noise affects the input state or the whole system. Thus, the HBMERA classification model demonstrates good robustness under depolarization noise.

Quantum state discrimination

In this subsection, we evaluate the performance of the HBMERA classification model in discriminating cluster state excitations. Cluster states are highly entangled states that serve as common initial states for measurement-based quantum systems. However, because of their high dimensionality, cluster states require exponentially increasing resources for data processing as the number of qubits grows. Consequently, discriminating cluster state excitations using classical computers is challenging¹⁶. We conduct a discriminative experiment on cluster state excitation. The preparation process for cluster states is as follows:

Initialize the state ${| 0 ⟩}^{\otimes n}$ , where n represents the number of qubits. For an 8-qubit cluster state, the initial state is ${| 0 ⟩}^{\otimes 8}$ .
Apply Hadamard gates to each qubit to create the superposition state ${| + ⟩}^{\otimes 8}$ , where $| + ⟩ = (| 0 ⟩ + | 1 ⟩) / \sqrt{2}$ .
Apply controlled-Z (CZ) gates between adjacent qubits. A CZ gate applies a phase flip (Z operation) to the target qubit when the control qubit is in the state $| 1 ⟩$ . This operation can be written as
$$C Z = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & - 1 \end{bmatrix} . \tag{15}$$
For qubits i and $i + 1$ , the CZ operation is described as
$$C Z_{i, i + 1} = \begin{cases} I_{i + 1} & \text{if qubit } i \text{ is } | 0 ⟩ \\ Z_{i + 1} & \text{if qubit } i \text{ is } | 1 ⟩ \end{cases} , \tag{16}$$
where $I_{i + 1}$ represents identity operation applied to the $(i + 1)$ th qubit and $Z_{i + 1}$ represents Z operation applied to the $(i + 1)$ th qubit.
Repeat applying CZ gates sequentially between adjacent qubits until all desired pairs have been operated upon.
In the excitation state preparation process, the RX gate acts on each qubit with a random rotation angle within the range of $[- π, π]$ . Subsequently, labels are assigned to cluster states based on the rotation angle: if the angle is between $- π / 2$ and $π / 2$ , the label is assigned as 1, indicating excitation; otherwise, it is assigned as -1, indicating no excitation.
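The preparation steps above can be sketched with a plain NumPy state-vector simulation. Several choices here are assumptions made for illustration: the CZ gates act on an open chain of adjacent qubits, a single random angle is drawn per sample and applied with an RX gate to every qubit, and the helper names (`apply_single`, `apply_cz`, `cluster_state`, `excited_sample`) are hypothetical.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def rx(theta):
    """Matrix of the RX rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def apply_single(state, gate, qubit, n):
    """Apply a 2x2 gate to one qubit of an n-qubit state vector."""
    psi = state.reshape([2] * n)
    psi = np.tensordot(gate, psi, axes=([1], [qubit]))
    psi = np.moveaxis(psi, 0, qubit)   # put the acted-on axis back in place
    return psi.reshape(-1)

def apply_cz(state, q1, q2, n):
    """Flip the sign of every amplitude in which both qubits are |1>."""
    psi = state.reshape([2] * n).copy()
    index = [slice(None)] * n
    index[q1], index[q2] = 1, 1
    psi[tuple(index)] *= -1
    return psi.reshape(-1)

def cluster_state(n):
    """|0...0> -> Hadamard on every qubit -> CZ on each adjacent pair."""
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = 1.0
    for q in range(n):
        state = apply_single(state, H, q, n)
    for q in range(n - 1):             # open chain (assumption)
        state = apply_cz(state, q, q + 1, n)
    return state

def excited_sample(n, rng):
    """Cluster state followed by RX(angle) on every qubit, plus its label."""
    angle = rng.uniform(-np.pi, np.pi)
    state = cluster_state(n)
    for q in range(n):
        state = apply_single(state, rx(angle), q, n)
    label = 1 if -np.pi / 2 <= angle <= np.pi / 2 else -1
    return state, label

rng = np.random.default_rng(0)
state, label = excited_sample(8, rng)
print(state.shape, label)
```

Qubit 0 is taken as the most significant bit of the basis index, so the two-qubit cluster state comes out as $(| 00 ⟩ + | 01 ⟩ + | 10 ⟩ - | 11 ⟩) / 2$ . The 256-dimensional vector for 8 qubits already hints at the exponential cost of classical processing mentioned earlier.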

In this simulation, the training set contains 800 cluster states and the test set contains 200. The training process consists of 25 epochs with a batch size of 32. Table 9 shows the accuracies and standard deviations for discriminating 8-qubit and 16-qubit cluster state excitations using the TTN¹³, MERA¹³, BMERA, and HBMERA classification models, with each value obtained from 5 simulations. Figure 13 shows the corresponding accuracies and standard deviations as a bar chart. Figure 14 shows the accuracies of discriminating 8-qubit and 16-qubit cluster state excitations. Simulation results reveal that the HBMERA classification model achieves mean accuracies of 99.06% and 97.92% in discriminating 8-qubit and 16-qubit cluster state excitations, respectively, surpassing those of the TTN, MERA, and BMERA classification models.

Table 9.

The accuracies and standard deviations (%) for discriminating cluster state excitation.

State	TTN	MERA	BMERA	HBMERA
8-qubit cluster	74.38 $\pm$ 6.50	96.25 $\pm$ 7.52	95.78 $\pm$ 3.69	99.06 $\pm$ 1.02
16-qubit cluster	79.19 $\pm$ 8.46	87.42 $\pm$ 0.60	93.62 $\pm$ 0.43	97.92 $\pm$ 0.60


The first and second rows represent the accuracies of discriminating 8-qubit and 16-qubit cluster states excitation.

Figure 13.

The accuracies and standard deviations (%) for discriminating cluster state excitation.

Figure 14.

The accuracies of discriminating whether a cluster state is excited. Panels (a–b) show the accuracies for discriminating 8-qubit and 16-qubit cluster state excitations. The solid lines represent the accuracies of the HBMERA classification model. The blue, orange, and green dotted lines represent the accuracies of the TTN, MERA, and BMERA classification models, respectively.

Conclusions and future work

Hybrid quantum-classical algorithms offer a promising avenue for integrating NISQ devices into machine learning tasks. Leveraging the entanglement benefits of tensor networks, we propose a hybrid quantum-classical classification model based on the branching multi-scale entanglement renormalization ansatz (BMERA). This model enhances its ability to approximate nonlinear functions through qubit encoding and an optimized circuit structure. Additionally, the proposed HBMERA model achieves a shallower circuit depth by shifting some computational complexity from quantum devices to classical neural networks, making it more compatible with NISQ devices. Simulation results demonstrate that the BMERA-based hybrid model (HBMERA) achieves higher accuracy than the TTN, MERA, and BMERA classification models in classical and quantum data classification tasks and exhibits robustness to depolarization noise. However, the accuracies of the BMERA and HBMERA models are lower than those of classical neural networks and TN models. This is mainly because the architectures and optimization methods of classical classification models have become increasingly mature and sophisticated after years of development. In contrast, quantum computers face limitations in terms of qubit count, circuit depth, and error correction. These limitations necessitate preprocessing operations, such as dimensionality reduction of classical data, before the quantum circuit is executed, which constrains the potential for further improving classification accuracy.

With advancements in quantum computers regarding scale and fault tolerance, these limitations will be greatly alleviated, potentially giving quantum models a competitive edge. For highly complex tasks, quantum models can more effectively capture global features and intricate relationships within the data, representing complex functions that are challenging for classical neural networks. This capability stems from the utilization of quantum superposition and entangled states, which may enable quantum neural networks to achieve higher accuracy in certain tasks. Currently, research on quantum classification models based on tensor network structures is in its early stages. The framework and architecture of quantum models require further optimization, which is an important direction for our future research. As the field progresses, we anticipate that improvements in quantum model design will lead to enhanced performance and new capabilities in machine learning.


Acknowledgements

This work was supported by the Open Fund of Advanced Cryptography and System Security Key Laboratory of Sichuan Province (Grant No. SKLACSS-202108), the Scientific Research Fund of Zaozhuang University (No. 102061901), and the Shandong Province College Student Innovation and Entrepreneurship Training Program Project (S202310904040).

Author contributions

Y.-Y.H. wrote the manuscript and conducted the experiment(s), L.J. conceived the experiment(s), X.T. analysed the results, X.-Y.L. reviewed the manuscript.

Data availability

Data is provided within the manuscript or supplementary information files. Codes will be made available on request.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Rebentrost, P., Mohseni, M. & Lloyd, S. Quantum support vector machine for big data classification. Phys. Rev. Lett. 113, 130503 (2014).
2. Schuld, M. & Killoran, N. Quantum machine learning in feature Hilbert spaces. Phys. Rev. Lett. 122, 040504 (2019).
3. Wang, Y., Lin, K.-Y., Cheng, S. & Li, L. Variational quantum extreme learning machine. Neurocomputing 512, 83–99 (2022).
4. Wang, Y., Wang, Y., Chen, C., Jiang, R. & Huang, W. Development of variational quantum deep neural networks for image recognition. Neurocomputing 501, 566–582 (2022).
5. Chen, Y., Wang, C., Guo, H., Gao, X. & Wu, J. Accelerating spiking neural networks using quantum algorithm with high success probability and high calculation accuracy. Neurocomputing 493, 435–444 (2022).
6. Martín-Guerrero, J. D. & Lamata, L. Quantum machine learning: A tutorial. Neurocomputing 470, 457–461 (2022).
7. Huang, R., Tan, X. & Xu, Q. Variational quantum tensor networks classifiers. Neurocomputing 452, 89–98 (2021).
8. Cohen, N., Sharir, O. & Shashua, A. On the expressive power of deep learning: A tensor analysis. In Conference on learning theory, 698–728 (PMLR, 2016).
9. Stoudenmire, E. & Schwab, D. J. Supervised learning with tensor networks. Adv. Neural Inf. Process. Syst. 29, 4799 (2016).
10. Wall, M. L., Abernathy, M. R. & Quiroz, G. Generative machine learning with tensor networks: Benchmarks on near-term quantum computers. Phys. Rev. Res. 3, 023010 (2021).
11. Cheng, S., Wang, L., Xiang, T. & Zhang, P. Tree tensor networks for generative modeling. Phys. Rev. B 99, 155131 (2019).
12. Benedetti, M. et al. A generative modeling approach for benchmarking and training shallow quantum circuits. npj Quantum Inf. 5, 1–9 (2019).
13. Grant, E. et al. Hierarchical quantum classifiers. npj Quantum Inf. 4, 1–8 (2018).
14. Pesah, A. et al. Absence of barren plateaus in quantum convolutional neural networks. Phys. Rev. X 11, 041011 (2021).
15. Vidal, G. Class of quantum many-body states that can be efficiently simulated. Phys. Rev. Lett. 101, 110501 (2008).
16. Broughton, M. et al. Tensorflow quantum: A software framework for quantum machine learning. arXiv preprint arXiv:2003.02989 (2020).
17. Verdon, G., Pye, J. & Broughton, M. A universal training algorithm for quantum deep learning. arXiv preprint arXiv:1806.09729 (2018).
18. Romero, J. & Aspuru-Guzik, A. Variational quantum generators: Generative adversarial quantum machine learning for continuous distributions. Adv. Quantum Technol. 4, 2000003 (2021).
19. Skolik, A., McClean, J. R., Mohseni, M., van der Smagt, P. & Leib, M. Layerwise learning for quantum neural networks. Quantum Mach. Intell. 3, 1–11 (2021).
20. MacCormack, I., Delaney, C., Galda, A. & Narang, P. Branching quantum convolutional neural networks: A variational ansatz with mid-circuit measurements. Bull. Am. Phys. Soc. 4(1), 013117 (2021).
21. Li, W. & Deng, D.-L. Recent advances for quantum classifiers. Sci. China Phys. Mech. Astron. 65, 220301 (2022).
22. Cong, I., Choi, S. & Lukin, M. D. Quantum convolutional neural networks. Nat. Phys. 15, 1273–1278 (2019).
23. Harrow, A. W. & Napp, J. C. Low-depth gradient measurements can improve convergence in variational hybrid quantum-classical algorithms. Phys. Rev. Lett. 126, 140502 (2021).
24. Schuld, M., Bergholm, V., Gogolin, C., Izaac, J. & Killoran, N. Evaluating analytic gradients on quantum hardware. Phys. Rev. A 99, 032331 (2019).
25. Hubregtsen, T., Pichlmeier, J., Stecher, P. & Bertels, K. Evaluation of parameterized quantum circuits: On the relation between classification accuracy, expressibility, and entangling capability. Quantum Mach. Intell. 3, 1–19 (2021).
26. Sim, S., Johnson, P. D. & Aspuru-Guzik, A. Expressibility and entangling capability of parameterized quantum circuits for hybrid quantum-classical algorithms. Adv. Quantum Technol. 2, 1900070 (2019).
27. Peruzzo, A. et al. A variational eigenvalue solver on a photonic quantum processor. Nat. Commun. 5, 1–7 (2014).
28. Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
29. Liu, D. et al. Machine learning by unitary tensor network of hierarchical tree structure. New J. Phys. 21, 073059 (2019).
30. Meng, Y.-M., Zhang, J., Zhang, P., Gao, C. & Ran, S.-J. Residual matrix product state for machine learning. SciPost Phys. 14, 142 (2023).
31. Cheng, S., Wang, L. & Zhang, P. Supervised learning with projected entangled pair states. Phys. Rev. B 103, 125117 (2021).
32. Hao, Z., AghaKouchak, A., Nakhjiri, N. & Farahmand, A. Global integrated drought monitoring and prediction system (GIDMaPS) data sets. figshare 10.6084/m9.figshare.853801 (2014).


