Skip to main content
Log in

Quantum autoencoders for communication-efficient cloud computing

  • Research
  • Published:
Quantum Machine Intelligence Aims and scope Submit manuscript

Abstract

In the model of quantum cloud computing, the server executes a computation on the quantum data provided by the client. In this scenario, it is important to reduce the amount of quantum communication between the client and the server. A possible approach is to transform the desired computation into a compressed version that acts on a smaller number of qubits, thereby reducing the amount of data exchanged between the client and the server. Here we propose quantum autoencoders for quantum gates (QAEGate) as a method for compressing quantum computations. We illustrate it in concrete scenarios of single-round and multi-round communication and validate it through numerical experiments. A bonus of our method is it does not reveal any information about the server’s computation other than the information present in the output.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16

Similar content being viewed by others

Availability of data and materials

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

  • Preskill J (2018) Quantum computing in the NISQ era and beyond. Quantum 2:79

    Article  Google Scholar 

  • Arute F et al (2019) Quantum supremacy using a programmable superconducting processor. Nature 574(7779):505–510

    Article  Google Scholar 

  • Zhong H-S et al (2020) Quantum computational advantage using photons. Science 370(6523):1460–1463

    Article  Google Scholar 

  • Broadbent A, Fitzsimons J, Kashefi E (2009) Universal blind quantum computation. In: 2009 50th Annual IEEE Symposium on Foundations of Computer Science. IEEE, pp 517–526

  • Arrighi P, Salvail L (2006) Blind quantum computation. Int J Quantum Inf 4(05):883–898

    Article  Google Scholar 

  • Morimae T, Fujii K (2013) Blind quantum computation protocol in which alice only makes measurements. Phys Rev A 87(5):050301

    Article  Google Scholar 

  • Schmidhuber J (2015) Deep learning in neural networks: An overview. Neural Netw 61:85–117

    Article  Google Scholar 

  • Romero J, Olson JP, Aspuru-Guzik A (2017) Quantum autoencoders for efficient compression of quantum data. Quantum Sci Technol 2(4):045001

    Article  Google Scholar 

  • Bottou L (2012) Stochastic gradient descent tricks. In: Neural Networks: Tricks of the Trade. Springer, pp 421–436

  • Armbrust M, Fox A, Griffith R, Joseph AD, Katz R, Konwinski A, Lee G, Patterson D, Rabkin A, Stoica I et al (2010) A view of cloud computing. Commun ACM 53(4):50–58

    Article  Google Scholar 

  • Barz S, Kashefi E, Broadbent A, Fitzsimons JF, Zeilinger A, Walther P (2012) Demonstration of blind quantum computing. Science 335(6066):303–308

    Article  MathSciNet  Google Scholar 

  • Yang Y, Chiribella G, Hayashi M (2020) Communication cost of quantum processes. IEEE J Sel Areas Inf Theory 1(2):387–400

    Article  Google Scholar 

  • Sheng Y-B, Zhou L (2017) Distributed secure quantum machine learning. Sci Bull 62(14):1025–1029

    Article  Google Scholar 

  • Bondarenko D, Feldmann P (2020) Quantum autoencoders to denoise quantum data. Phys Rev Lett 124(13):130502

    Article  Google Scholar 

  • Achache T, Horesh L, Smolin J (2020) Denoising quantum states with Quantum Autoencoders–Theory and Applications . arXiv preprint arXiv:2012.14714

  • Nielsen MA, Chuang IL (1997) Programmable quantum gate arrays. Phys Rev Lett 79(2):321

    Article  MathSciNet  Google Scholar 

  • Yang Y, Renner R, Chiribella G (2020) Optimal universal programming of unitary gates. Phys Rev Lett 125(21):210501

    Article  MathSciNet  Google Scholar 

  • Choi M-D (1975) Completely positive linear maps on complex matrices. Linear Algebra Appl 10(3):285–290

    Article  MathSciNet  Google Scholar 

  • Jamiołkowski A (1972) Linear transformations which preserve trace and positive semidefiniteness of operators. Rep Math Phys 3(4):275–278

    Article  MathSciNet  Google Scholar 

  • Nielsen MA, Chuang I (2002) Quantum computation and quantum information. American Association of Physics Teachers

  • Chiribella G, D’Ariano GM, Perinotti P (2008) Transforming quantum operations: Quantum supermaps. EPL Europhys Lett 83(3):30004

    Article  Google Scholar 

  • Chiribella G, D’Ariano GM, Perinotti P (2009) Theoretical framework for quantum networks. Phys Rev A 80(2):022339

  • Cong I, Choi S, Lukin MD (2019) Quantum convolutional neural networks. Nat Phys 15(12):1273–1278

  • Farhi E, Goldstone J, Gutmann S (2014) A quantum approximate optimization algorithm. arXiv preprint arXiv:1411.4028

  • Debnath S, Linke NM, Figgatt C, Landsman KA, Wright K, Monroe C (2016) Demonstration of a small programmable quantum computer with atomic qubits. Nature 536(7614):63–66

    Article  Google Scholar 

  • Kiefer J, Wolfowitz J et al (1952) Stochastic estimation of the maximum of a regression function. Ann Math Stat 23(3):462–466

    Article  MathSciNet  Google Scholar 

  • McClean JR, Boixo S, Smelyanskiy VN, Babbush R, Neven H (2018) Barren plateaus in quantum neural network training landscapes. Nat Commun 9(1):4812

    Article  Google Scholar 

  • Broughton M, Verdon G, McCourt T, Martinez AJ, Yoo JH, Isakov SV, Massey P, Halavati R, Niu MY, Zlokapa A et al (2020) Tensorflow quantum: A software framework for quantum machine learning. arXiv preprint arXiv:2003.02989

  • Baxter RJ (2007) Exactly Solved Models in Statistical Mechanics. Courier Corporation

  • Williamson DF, Parker RA, Kendrick JS (1989) The box plot: a simple visual method to interpret data. Ann Intern Med 110(11):916–921

    Article  Google Scholar 

  • Bisio A, Chiribella G, D’Ariano GM, Facchini S, Perinotti P (2010) Optimal quantum learning of a unitary transformation. Phys Rev A 81(3):032324

    Article  Google Scholar 

  • Mo Y, Chiribella G (2019) Quantum-enhanced learning of rotations about an unknown direction. New J Phys 21(11):113003

    Article  MathSciNet  Google Scholar 

  • Sedlák M, Ziman M (2020) Probabilistic storage and retrieval of qubit phase gates. Phys Rev A 102(3):032618

    Article  Google Scholar 

  • Bishop LS, Bravyi S, Cross A, Gambetta JM, Smolin J (2017) Quantum volume. Quantum Volume, Technical Report

    Google Scholar 

  • Moll N, Barkoutsos P, Bishop LS, Chow JM, Cross A, Egger DJ, Filipp S, Fuhrer A, Gambetta JM, Ganzhorn M et al (2018) Quantum optimization using variational algorithms on near-term quantum devices. Quantum Sci Technol 3(3):030503

    Article  Google Scholar 

  • Allen-Zhu Z (2017) Natasha 2: Faster non-convex optimization than sgd. arXiv preprint arXiv:1708.08694

  • Sweke R, Wilde F, Meyer JJ, Schuld M, Fährmann PK, Meynard-Piganeau B, Eisert J (2020) Stochastic gradient descent for hybrid quantum-classical optimization. Quantum 4:314

    Article  Google Scholar 

  • Developers Cirq (2022). Cirq Zenodo. https://doi.org/10.5281/ZENODO.7465577

  • Mitarai K, Negoro M, Kitagawa M, Fujii K (2018) Quantum circuit learning. Physical Review A 98(3):032309

    Google Scholar 

Download references

Funding

This work was supported by funding from the Hong Kong Research Grant Council through grants no. 17300918 and no. 17307520, through the Senior Research Fellowship Scheme SRFS2021-7S02, the Croucher Foundation, and by the John Templeton Foundation through grant 62312, The Quantum Information Structure of Spacetime (qiss.fr). YXW acknowledges funding from the National Natural Science Foundation of China through grants no. 61872318. Research at the Perimeter Institute is supported by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Research, Innovation and Science. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation.

Author information

Authors and Affiliations

Authors

Contributions

Zhu and Li provided the main ideas for the work, while Zhu, Bai, and Wang designed the numerical experiments. Zhu and Chiribella drafted the main manuscript text, and all authors reviewed the manuscript.

Corresponding authors

Correspondence to Tongyang Li or Giulio Chiribella.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

1.1 A. Proof of Theorem 1

According to Allen-Zhu (2017), we have the following lemma about convergence of SGD for nonconvex functions:

Lemma 1

If a function f(x) is \(L_1\)-smooth and we perform SGD update \(x_{t+1}\leftarrow x_t - \eta \nabla f_i(x_t)\) each time for a random \(i\in [n]\), then the convergence rate is \(T = \mathcal {O}(\frac{L_1}{\epsilon ^4})\) for finding \(\mathbb {E}[\Vert \nabla f(x) \Vert ^2]\le \epsilon ^2\).

Therefore, we first discuss the smoothness property of the loss function \(\mathcal {L}(\varvec{\theta })\) in the training of QAEGate model. Note that the first- and second-order Lipschitz smoothness are equivalent to Lipschitz continuity of the first- and second-order derivatives. So we start with a lemma about the Lipschitz continuity of multivariate functions (Sweke et al. 2020).

Lemma 2

Given some function \(f :\textbf{R}^M \rightarrow \textbf{R}\), if all partial derivatives of f are continuous, then for any \(a, b \in \textbf{R}\) the function \(f :[a, b]^M \rightarrow \textbf{R}\) is L-Lipschitz continuous with

$$\begin{aligned} L = \sqrt{M}\underset{j \in \{1,\dots ,M\}}{\textrm{max}} \sup _{\varvec{x}\in [a, b]^M} \left| \frac{\partial f(\varvec{x})}{\partial x_j}\right| . \end{aligned}$$
(3)

Next we prove a more general result about the smoothness of measurement probability of quantum circuits, and then show that \(\mathcal {L}(\varvec{\theta })\) fits into this result.

Lemma 3

Consider a quantum circuit consisting of any number of fixed unitary gates and M variable unitary gates \(U_1(\theta _1),\dots ,U_M(\theta _M)\), where \(U_j(\theta _j):= \textrm{exp}(i\theta _j H_j)\) for some Hermitian operator \(H_j\). Then the probability of any measurement outcome of the output of this circuit is \(L_1\)-smooth and \(L_2\)-second-order smooth with respect to \(\varvec{\theta }=(\theta _1,\dots ,\theta _M)\), where

$$\begin{aligned} L_1= & {} 4MH_{\textrm{max}}^2\end{aligned}$$
(4)
$$\begin{aligned} L_2= & {} 8M^{3/2}H_{\textrm{max}}^3 \end{aligned}$$
(5)

where \(H_{\textrm{max}}:=\textrm{max}_{j} \Vert H_j\Vert _2\) and \(\Vert \cdot \Vert _2\) is the induced operator norm defined as \(\Vert A\Vert _2:= \sup \frac{\Vert Ax\Vert _2}{\Vert x\Vert _2}\).

Proof

Let \(\vert \psi _0\rangle \) be the initial state of the circuit. Without loss of generality, we assume the variable unitary gates are labeled as \(U_1(\theta _1)\) to \(U_M(\theta _M)\) according to the order they are applied. Then the output of this circuit can be written as \(\vert \psi _{\varvec{\theta }}\rangle := V_M U_M(\theta _M) \dots V_1 U_1(\theta _1) V_0 \vert \psi _0\rangle \), where \(V_1,\dots ,V_M\) denotes the fixed unitary gates. A measurement outcome can be described by a positive operator-valued measure (POVM) element P with the property that P and \(I-P\) are both positive semidefinite, where I is the identity operator of the output. The probability of this outcome is

$$\begin{aligned} f(\varvec{\theta }):= \textrm{Tr}[P \vert \psi _{\varvec{\theta }}\rangle \langle \psi _{\varvec{\theta }}\vert ] = \langle \psi _{\varvec{\theta }}\vert P \vert \psi _{\varvec{\theta }}\rangle . \end{aligned}$$
(6)

To study the smoothness of \(f(\varvec{\theta })\), we take the partial derivatives:

$$\begin{aligned} \frac{\partial f(\varvec{\theta })}{\partial \theta _j}= & {} \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j} P \vert \psi _{\varvec{\theta }}\rangle + \langle \psi _{\varvec{\theta }}\vert P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _j}\\= & {} 2 \Re \left( \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j} P \vert \psi _{\varvec{\theta }}\rangle \right) \\ \frac{\partial ^2 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k}= & {} 2 \Re \left( \frac{\partial ^2\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k} P \vert \psi _{\varvec{\theta }}\rangle + \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j} P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k} \right) \\ \frac{\partial ^3 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k \partial \theta _l}= & {} 2 \Re \left( \frac{\partial ^3\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k \partial \theta _l} P \vert \psi _{\varvec{\theta }}\rangle +\frac{\partial ^2\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k} P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _l}\right. \\{} & {} \left. + \frac{\partial ^2 \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _l} P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k} + \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j} P \frac{\partial ^2\vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k \partial \theta _l} \right) \end{aligned}$$

These partial derivatives can be bounded with vector and operator norms. For a unitary operator U, \(\Vert U\Vert _2=1\). For the POVM element P, \(\Vert P\Vert _2\le 1\). The second-order partial derivatives can be bounded as:

$$\begin{aligned} \left| \frac{\partial ^2 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k}\right|\le & {} 2 \left| \frac{\partial ^2\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k} P \vert \psi _{\varvec{\theta }}\rangle + \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j} P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k} \right| \nonumber \\\le & {} 2\left\| \frac{\partial ^2\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k}\right\| _2 \Vert P\Vert _2 \Vert \vert \psi _{\varvec{\theta }}\rangle \Vert _2 + 2\left\| \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j}\right\| _2\nonumber \\{} & {} \times \Vert P\Vert _2 \left\| \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k}\right\| _2 \nonumber \\\le & {} 2\underset{a,b}{\textrm{max}}\left\| \frac{\partial ^2\vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b}\right\| _2 + 2 \underset{a}{\textrm{max}} \left\| \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a}\right\| _2^2 \end{aligned}$$
(7)

Similarly, for the third-order partial derivatives,

$$\begin{aligned} \left| \frac{\partial ^3 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k \partial \theta _l} \right|\le & {} 2 \left| \frac{\partial ^3\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k \partial \theta _l} P \vert \psi _{\varvec{\theta }}\rangle + \frac{\partial ^2\langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _k} P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _l}\right. \nonumber \\{} & {} \left. + \frac{\partial ^2 \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j \partial \theta _l} P \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k} + \frac{\partial \langle \psi _{\varvec{\theta }}\vert }{\partial \theta _j} P \frac{\partial ^2\vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _k \partial \theta _l} \right| \nonumber \\\le & {} \!2\underset{a,b,c}{\textrm{max}}\left\| \frac{\partial ^3 \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b \partial \theta _c}\right\| _2 \!+\! 6\! \left( \underset{a,b}{\textrm{max}}\left\| \frac{\partial ^2 \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b}\right\| _2 \right) \nonumber \\{} & {} \! \times \!\left( \underset{a}{\textrm{max}}\left\| \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a }\right\| _2 \right) \end{aligned}$$
(8)

Therefore, to show that the second- and third-order partial derivatives are bounded, it suffices to show that \(\textrm{max}_{a}\left\| \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a }\right\| _2, \textrm{max}_{a,b}\left\| \frac{\partial ^2 \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b}\right\| _2\) and \(\textrm{max}_{a,b,c}\left\| \frac{\partial ^3 \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b \partial \theta _c}\right\| _2\) are all bounded. We observe that \(\frac{\partial U_j(\theta _j)}{\partial j} = \frac{\partial \textrm{exp}(i\theta _j H_j)}{\partial j} = iH_j U_j(\theta _j)\), and we have

$$\begin{aligned} \underset{a}{\textrm{max}}\left\| \frac{\partial \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a }\right\| _2= & {} \underset{a}{\textrm{max}}\left\| V_M U_M(\theta _M)\dots V_a i H_a U_a(\theta _a)\right. \nonumber \\{} & {} \left. \dots U_1(\theta _1)V_0 \vert \psi _0\rangle \right\| _2 \nonumber \\\le & {} \!\underset{a}{\textrm{max}} \left\| V_M U_M(\theta _M)\! \dots \!V_a \right\| _2 \!\Vert H_a\Vert _2 \left\| U_a(\theta _a)\!\right. \nonumber \\{} & {} \left. \dots U_1(\theta _1)V_0 \vert \psi _0\rangle \right\| _2 \nonumber \\= & {} \underset{a}{\textrm{max}} \Vert H_a\Vert _2 = H_{\textrm{max}} \end{aligned}$$
(9)

With similar derivations, we can obtain \(\textrm{max}_{a,b}\left\| \frac{\partial ^2 \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b}\right\| _2^2 \le H_{\textrm{max}}^2\) and \(\textrm{max}_{a,b,c}\left\| \frac{\partial ^3 \vert \psi _{\varvec{\theta }}\rangle }{\partial \theta _a \partial \theta _b \partial \theta _c}\right\| _2 \le H_{\textrm{max}}^3\). According to Eqs. (7) and (8), we obtain

$$\begin{aligned} \left| \frac{\partial ^2 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k}\right|\le & {} 4 H_{\textrm{max}}^2 \end{aligned}$$
(10)
$$\begin{aligned} \left| \frac{\partial ^3 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k \partial \theta _l} \right|\le & {} 8 H_{\textrm{max}}^3 \end{aligned}$$
(11)

Consider \(\frac{\partial f(\varvec{\theta })}{\partial \theta _j}\) as a function of \(\varvec{\theta }\). Since its partial derivatives \(\frac{\partial ^2 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k}\) are bounded by \(4 H_{\textrm{max}}^2\), according to Lemma 2, \(\frac{\partial f(\varvec{\theta })}{\partial \theta _j}\) is \(L_1'\)-continuous with \(L_1'=4 \sqrt{M} H_{\textrm{max}}^2\). Then

$$\begin{aligned} \Vert \nabla f(\varvec{\alpha }) - \nabla f(\varvec{\beta }) \Vert _2\le & {} \sqrt{M}\underset{j}{\textrm{max}} \left| \frac{\partial f(\varvec{\alpha })}{\partial \theta _j} - \frac{\partial f(\varvec{\beta })}{\partial \theta _j} \right| \nonumber \\\le & {} \sqrt{M} L_1' \Vert \alpha - \beta \Vert _2 = 4 M H_{\textrm{max}}^2 \Vert \alpha - \beta \Vert _2 \end{aligned}$$
(12)

and therefore f is \(L_1\)-smooth with \(L_1 = 4 M H_{\textrm{max}}^2\). Similarly, consider \(\frac{\partial ^2 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k}\) as a function of \(\varvec{\theta }\). Since its partial derivatives \(\frac{\partial ^3 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k \partial \theta _l}\) are bounded by \(8 H_{\textrm{max}}^3\), according to Lemma 2, \(\frac{\partial ^2 f(\varvec{\theta })}{\partial \theta _j \partial \theta _k}\) is \(L_2'\)-continuous with \(L_2'=8\sqrt{M} H_{\textrm{max}}^3\). Then

$$\begin{aligned} \Vert \nabla ^2 f(\varvec{\alpha }) - \nabla ^2 f(\varvec{\beta }) \Vert _2\le & {} M \underset{j,k}{\textrm{max}} \left| \frac{\partial ^2 f(\varvec{\alpha })}{\partial \theta _j \partial \theta _k} - \frac{\partial ^2 f(\varvec{\beta })}{\partial \theta _j \partial \theta _k} \right| \nonumber \\\le & {} M L_2' \Vert \alpha \!-\! \beta \Vert _2 \!=\! 8 M^{3/2} H_{\textrm{max}}^3 \Vert \alpha \!-\! \beta \Vert _2 \end{aligned}$$
(13)

and therefore f is \(L_2\)-second-order smooth with \(L_2 = 8 M^{3/2} H_{\textrm{max}}^3\). \(\square \)

Based on above lemma, we obtain the following theorem:

Theorem 3

(Smoothness) The loss function \(\mathcal {L}(\varvec{\theta })\) in the training of QAEGate model is \(L_1\)-smooth and \(L_2\)-second-order smooth.

Proof

Let \(p(\varvec{\theta })\) be the probability of outcome \(\vert +\rangle \) in the SWAP test when the gates are parameterized with \(\varvec{\theta =(\theta _{\textrm{le}},\theta _{\textrm{re}},\theta _{\textrm{ld}},\theta _{\textrm{rd}})}\). Because of \(\mathcal {L}(\varvec{\theta }) = 1-f(\varvec{\theta })\) and \(f(\varvec{\theta }) = 2p(\varvec{\theta })-1\), we just need to discuss the smoothness of \(p(\varvec{\theta })\). Since \(p(\varvec{\theta })\) is the probability of a measurement outcome of a unitary circuit, and in this circuit, every variable gate has the form of \(U_\theta = \textrm{exp}(i\theta H)\), we can apply Lemma 3. Thus, \(p(\varvec{\theta })\) is \(L_1\)-smooth and \(L_2\)-second-order smooth where \(L_1 = 4MH_{\textrm{max}}^2, L_2 = 8M^{3/2}H_{\textrm{max}}^3\) and M is the number of variable gates in the circuit. To prove Theorem 3, we only need to show that \(H_{\textrm{max}}\) is bounded.

The variable gates in our circuit only includes the following single- and two-qubit gates:

$$\begin{aligned} R_X(\theta ):= & {} \textrm{exp}(i\theta \sigma _x),\ R_Y(\theta ):= \textrm{exp}(i\theta \sigma _y),\ R_Z(\theta )\nonumber \\{} & {} := \textrm{exp}(i\theta \sigma _z) \end{aligned}$$
(14)
$$\begin{aligned} XX(\theta ):= & {} \textrm{exp}(i\theta \sigma _x\otimes \sigma _x),\ YY(\theta ):=\textrm{exp}(i\theta \sigma _y\otimes \sigma _y),\nonumber \\{} & {} ZZ(\theta ):=\textrm{exp}(i\theta \sigma _z\otimes \sigma _z) \end{aligned}$$
(15)

Therefore, \(H_{\textrm{max}} = \textrm{max}\{\Vert \sigma _x\Vert _2,\Vert \sigma _y\Vert _2,\Vert \sigma _z\Vert _2, \Vert \sigma _x\otimes \sigma _x\Vert _2, \Vert \sigma _y\otimes \sigma _y\Vert _2,\Vert \sigma _z\otimes \sigma _z\Vert _2\} = 1\). \(\square \)

Proof

(Proof of Theorem 1) Based on Theorem 3, we have proved that \(\mathcal {L}(\varvec{\theta })\) is \(L_1\)-smooth, where \(L_1 = 4M\). In our implementation, there is only one variable for each variable unitary gate, so \(M=\dim {\varvec{\theta }} = \mathcal {O}(n^2)\) here. Then we can apply Lemma 1 and obtain that the convergence rate is \(T=\mathcal {O}(\frac{n^2}{\epsilon ^4})\) for finding \(\mathbb {E}[\Vert \nabla _{\varvec{\theta }} \mathcal {L}(\varvec{\theta }) \Vert ^2]\le \epsilon ^2\). In order to estimate the gradients, we have to compute the loss function for \(\mathcal {O}(\dim {\varvec{\theta }}) = \mathcal {O}(n^2)\) times. The total time complexity of the algorithm is \(\mathcal {O}(\frac{n^4}{\epsilon ^4})\). \(\square \)

1.2 B. Details of numerical experiments

Data generation

For each class of gates U(t) we considered in different scenarios, we randomly generate 60 different \(U(t_i), t_i \in [0,2]\) and split them into the training set and the test set with ratio 50 : 10.

Hyperparameters

We set the learning rate \(\eta \) as 0.005, the maximum number of iterations K as 150 and the threshold \(\delta \) as 0.999 in all experiments.

Impelementation details

We implement our proposed model QAEGate based on Tensorflow Quantum (Broughton et al. 2020). The parameterized quantum circuits are constructed by the tools provided in Cirq (2022) and the gradients are estimated by the differentiators based on parameter shift rule (Mitarai et al. 2018).

Training details

We employed stochastic gradient descent to train all the models in our experiments, using a single GPU. The training time for all the experiments presented in this paper is less than one hour.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Zhu, Y., Bai, G., Wang, Y. et al. Quantum autoencoders for communication-efficient cloud computing. Quantum Mach. Intell. 5, 27 (2023). https://doi.org/10.1007/s42484-023-00112-5

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s42484-023-00112-5

Keywords

Navigation