Quantum Orbital Minimization Method for Excited States Calculation on Quantum Computer

Page created by Casey Mcgee
 
CONTINUE READING
Quantum Orbital Minimization Method for
                                                          Excited States Calculation on Quantum
                                                                         Computer
arXiv:2201.07963v2 [physics.chem-ph] 10 Jun 2022

                                                                       Joel Bierman,† Yingzhou Li,∗,‡ and Jianfeng Lu∗,¶,†,§
                                                                               †Department of Physics, Duke University
                                                                          ‡School of Mathematical Sciences, Fudan University
                                                                            ¶Department of Mathematics, Duke University
                                                                              §Department of Chemistry, Duke University

                                                                         E-mail: yingzhouli@fudan.edu.cn; jianfeng@math.duke.edu

                                                   Abstract                                               quantum computing is demonstrated by sev-
                                                                                                          eral groups. 1,2 One of the most promising ap-
                                                   We propose a quantum-classical hybrid vari-            plications for noisy intermediate-scale quan-
                                                   ational algorithm, the quantum orbital mini-           tum (NISQ) devices is the variational quan-
                                                   mization method (qOMM), for obtaining the              tum eigensolver (VQE) under the full config-
                                                   ground state and low-lying excited states of           uration interaction (FCI) framework. 3–8 FCI is
                                                   a Hermitian operator. Given parameterized              a quantum chemistry method which discretizes
                                                   ansatz circuits representing eigenstates, qOMM         the time-independent many-body Schrödinger
                                                   implements quantum circuits to represent the           equation numerically exactly. The ground state
                                                   objective function in the orbital minimization         and low-lying excited states are calculated via
                                                   method and adopts a classical optimizer to min-        solving a standard eigenvalue problem, while
                                                   imize the objective function with respect to the       the problem dimension scales exponentially as
                                                   parameters in ansatz circuits. The objective           the number of electrons in the system. The
                                                   function has orthogonality constraint implic-          FCI framework fits naturally into quantum
                                                   itly embedded, which allows qOMM to apply              computing. Quantum algorithms for ground
                                                   a different ansatz circuit to each input refer-         state eigensolvers under FCI framework have
                                                   ence state. We carry out numerical simulations         been extensively developed in past decades. 9–13
                                                   that seek to find excited states of H2, LiH, and        Among them, VQE is a quantum-classical hy-
                                                   a toy model consisting of 4 hydrogen atoms ar-         brid method and has the shortest circuit depth,
                                                   ranged in a square lattice in the STO-3G basis         which makes it the most widely applied ground
                                                   with UCCSD ansatz circuits. Comparing the              state quantum eigensolver on NISQ devices. In
                                                   numerical results with existing excited states         short, VQE adopts a parameterized circuit as
                                                   methods, qOMM is less prone to getting stuck           the variational ansatz |φθ i for the ground state.
                                                   in local minima and can achieve convergence            Then the energy function hφθ |H|φ  b θ i is mini-
                                                   with more shallow ansatz circuits.                     mized with respect to the parameter θ, where
                                                                                                          Hb is the Hamiltonian operator.
                                                                                                            In addition to the ground state, excited states
                                                   1    Introduction                                      play an important role in connecting com-
                                                   The development of quantum computing has               putational results and experimental observa-
                                                   boomed in recent years. The supremacy of               tions in quantum chemistry. In this paper,

                                                                                                      1
we propose a hybrid quantum-classical algo-            Then (H   b − λI)2 is used as the operator in VQE
rithm, named quantum orbital minimization              method, and an excited state with the energy
method (qOMM), to compute the excited state            closest to λ is obtained via the optimization
energies as well as the corresponding states un-       procedure in VQE. Witness-assisted variational
der FCI framework.                                     eigenspectra solver 19,20 adopts an entropy term
  Existing approaches for excited states on            as the regularizer in the objective function to
quantum computers could be grouped into two            keep the minimized excited state close to its ini-
categories: the excited state energy computa-          tial state, where the initial state is set to be an
tion and the excited state vector computation.         excitation from the ground state. Quantum de-
  Excited state energy computation on quan-            flation method 21,22 adopts the overlapping be-
tum computer evaluates the excited state en-           tween the target state and the ground state as
ergy without explicitly constructing the state         a penalty term in the objective function to keep
vector. Two methods, quantum subspace ex-              the target state as orthogonal as possible to the
pansion (QSE) 14–16 and quantum equation of            ground state. The extension to more than one
motion (qEoM), 17 are representatives of this          excited state in the quantum deflation method
category. Both of them are based on pertur-            is straightforward. All these methods above
bation theory starting from the ground state.          have their classical algorithm counterparts. On
They first construct the ground state |φGS i via        the other hand, the following two methods use
VQE. Then they evaluate the expansion values           a unique property of quantum computing, i.e.,
on single excited states from the ground state         all quantum operators are unitary and orthog-
|φGS i, i.e.,                                          onal states after the actions of quantum oper-
                    D                         E        ators are still orthogonal to each other. Sub-
                         a†q2 b
        App21 qq21 = φGS b        b a† b
                              ap2 Hb   a  φ            space search VQE (SSVQE) 23 and multistate
                                     p1 q1 GS
                                                       contracted VQE (MCVQE) 24 apply the same
and                                                    basic ansatz. They first prepare a sequence of
                                                       non-parameterized mutually orthogonal states,
                       a†q2 b
       Bpp12qq12 = φGS b        a†p1 b
                            ap2 b    aq1 φGS           |φ1 i , |φ2 i , . . . , |φK i for K excited states. Then
                                                       a parameterized circuit U            bθ is applied to these
for b
    a†p and bap being creation and annihilation                       b
                                                       states, as Uθ is unitary (implemented on a quan-
operators. Once these expansion values are                                       bθ |φ1 i , . . . , U
                                                                                                    bθ |φK i are mutu-
                                                       tum computer), U
available, QSE solves a generalized eigenvalue
                                                       ally orthogonal. Hence we could minimize the
problem of matrix pencil (A, B) and obtains
                                                       objective function
the refined ground state energy and low-lying
excited state energies on the perturbed space                                XD                         E
                                                                                   φk Ub †H bU  bθ φk
of |φGS i. qEoM adopts these expansion val-                                              θ
                                                                          k
ues into the equation of motion expression and
obtains the low-lying excited state energies.          to obtain approximated ground state and ex-
Double excitations and higher order excitations        cited states. The energies associated with
from the ground state could be further evalu-          bθ |φ1 i , . . . , U
                                                       U                  bθ |φK i are usually not ordered.
ated and included to improve the accuracy of           SSVQE further introduces a min-max problem
QSE and qEoM. However, including more ex-              to obtain a specific excited state. Also, in the
citations leads to a much higher computational         SSVQE paper, a weighted objective function is
cost.                                                  employed to preserve the ordering, which will
  Excited state vector computations include a          be the version we investigate numerically in our
wide spectrum of methods. Folded spectrum              paper below. The MCVQE approach enriches
method 18 folds the spectrum of a Hamiltonian          the ansatz by introducing a unitary matrix ap-
operator via a shift-and-square operation, i.e.,       plied across states in aP     classical post-processing
 b → (H  b − λI)2 , where λ is a chosen num-           step, i.e., |ψk i = U      bθ
H                                                                                      ℓ |φℓ i Vℓk for Vℓk being
ber close to the desired excited state energy.         the (ℓ, k)-th entry of a unitary matrix V . Hence

                                                   2
a standard eigenvalue problem is solved as the                   qEoM as SSVQE and MCVQE. Namely, the
post-processing in MCVQE to obtain V .                           excited states in qOMM are encoded into the
  In this paper, we propose to obtain the ex-                    objective function of an optimization problem
cited state vector using the objective function in               and not as linear combinations of single and
the orbital minimization method (OMM), 25,26                     double excitations above the ground state in a
                                                                 post-processing step. This allows the method to
                               K D
                               X               E
                                        b ψi                     1. compute excited states for which triple and
f (|ψ1 i , . . . , |ψK i) =2         ψi H                        higher excitation terms have a non-negligible
                               i=1
                                                                 contribution without calculating higher order
                                 K
                                 X              D       E
                           −                        b ψi .
                                       hψi |ψj i ψj H            reduced density matrices and 2. treat the
                               i,j=1
                                                                 ground and excited states on the same foot-
                                                       (1)       ing.
                                                                   Notice that the objective function Eq. (1) in-
We refer our method as quantum orbital min-                      cludes the evaluation of the state inner prod-
imization method (qOMM) throughout this                          uct with respect to the Hamiltonian operator
paper. An important property of Eq. (1) is                                                                b j i and
                                                                 and the state inner product, i.e., hφi |H|ψ
that, even though the orthogonality conditions                    hψi |ψj i, respectively. In other existing excited
among |Ψi is are not explicitly enforced, the                    state methods, neither of these two are eval-
minimizer of Eq. (1) consists of mutually or-                    uated directly.1 Given our variational ansatz,
thogonal state vectors. For most other meth-                     we construct quantum circuits to evaluate both
ods, the orthogonality condition is explicitly                   inner products, which are essentially inspired
proposed as a constraint in the optimization                     by the Hadamard test. When the evaluation of
problem. The implicit orthogonality property                     Eq. (1) is available on a quantum computer, we
allows us to adopt a different ansatz (with re-                   can adopt gradient-free optimizers to minimize
gards to either the numerical parameter values,                  the objective function with respect to parame-
the structure of the ansatz, or both) for each ex-               ters, θ1 , . . . , θK .
cited state. In this paper, the variational ansatz                 Finally, we include a sequence of numerical
class that we use is of the form,                                results to demonstrate the efficiency and ro-
                                                                 bustness of the proposed method. We test
                   E XK
                                                                 the following chemistry systems using the pro-
                 e
                 ψk =   bθ |φℓ i Vℓk ,
                        U                              (2)
                          ℓ                                      posed method on quantum simulators: H2 and
                           ℓ=1
                                                                 LiH at their equilibrium configurations, H2 at
where |φ1 i , . . . , |φK i are initial states,                  a stretched bond distance of twice the equilib-
bθ , . . . , U
U             bθ are parameterized circuits with                 rium distance, and a toy model consisting of 4
    1           K
θ1 , . . . , θK being the parameters, Vℓk is the                 hydrogen atoms arranged in a square at near-
(ℓ, k)-th entry of an invertible matrix V that is                equilibrium distances and a stretched bond dis-
applied during a classical post-processing step.                 tance configuration. We observe that qOMM
Such an expression is more general than those                    has two major benefits compared to algorithms
used in other excited state vector methods,                      such as weighted-SSVQE (In this paper, when
which typically either apply the same circuit                    we refer to SSVQE, we implicitly mean the
bθ to all input states or do not apply the
U                                                                weighted version of SSVQE that uses one op-
classical post-processing matrix V . Another                     timization step for finding multiple eigenvalues
feature of Eq. (1) is that the objective function                simultaneously.) that enforce the orthogonality
does not include any hyper-parameter (unlike,                    of the input states at every optimization step.
e.g., the deflation approach), which makes the                    1) Enforcing the mutual orthogonality of the in-
method easy to use in practice without pa-                          1
                                                                     In deflation method, 21,22 the absolute value of the
rameter tuning. Furthermore, this approach                       inner product, | hψi |ψj i| , is evaluated, while we need
shares the same advantages over QSE and                          the actual value of the inner product without taking
                                                                 the absolute value.

                                                             3
put states through the overlap terms allows for           products for the case in which they are different
greater flexibility in the choice of ansatz applied        in this section.
to each input state, as well as greater flexibility          To simplify the notations, we consider eval-
in the choice of input states. The circuit ap-            uating hψ|φi and hψ|H|φi,b      where |ψi and |φi
plied to each input state must be identical in the        are given by a unitary circuit acting on |0i, i.e.,
SSVQE framework in order to enforce orthogo-              |ψi = U   bθ |0i and |φi = U  bθ |0i. Given the
                                                                      ψ                    φ

nality, whereas qOMM allows us to apply a dif-            variational
                                                          D             ansatz,
                                                                         E      the inner  product  hψ|φi =
ferent ansatz circuit to each input state. 2) The           0Ub U
                                                                † b
                                                               θψ θφ 0 can be viewed as the inner prod-
tendency to get stuck in local minima, while not                                                        b =
                                                          uct of |0i with respect to the operator O
eliminated entirely, is diminished considerably.
                                                          b U
                                                            † b
The first benefit is important because choosing             U θψ θφ .    In principle, the transpose of the
                                                          ansatz circuit can be constructed. After the
a suitably expressive ansatz for excited states is
                                                          ansatz circuit is compiled into basic gates, the
often more difficult than the ground state VQE
                                                          transpose of the circuit is the composition of the
problem. In practice, this means that one is
                                                          transposed basic gates in the reverse ordering.
often able to use a more shallow and less ex-
                                                          Hence hψ|φi could be then evaluated using the
pressive ansatz circuit; a feature that is crucial
                                                          Hadamard test, which requires the controlled
for applications on NISQ devices.
                                                          ansatz circuit and the controlled transpose of
  The rest of the paper is organized as follows.          the ansatz circuit. Instead, we adopt an idea
Section 2 illustrates the circuits we use to eval-        inspired by the Hadamard test, which can be
uate the inner products. The overall method                                         b
                                                          applied to evaluate hψ|H|φi     without controlled
is discussed in Section 3. In Section 4, we ap-           transpose of the ansatz circuit. In the following,
ply the method to various chemistry molecules             we discuss such two quantum circuits evaluat-
to demonstrate the efficiency of the proposed               ing the inner products in detail.
method. Finally, Section 5 concludes the paper
and discusses future work.
                                                          2.1     hψ|φi evaluation
                                                          For complex-valued states |ψi and |φi, the in-
2    Inner Product Circuits                               ner product hψ|φi is also a complex number.
                                                          The evaluation of hψ|φi is divided into the eval-
In the objective function of qOMM, the inner              uations of the real and imaginary parts. We
product of two states and the inner product               will introduce the circuit for evaluation of the
of two states with respect to an operator are             real part of hψ|φi in detail, and the circuit for
complex numbers. Both the real and imaginary              the imaginary part could be constructed analo-
parts are explicitly needed to construct the ob-          gously.
jective function. If two states are identical, then         Figure 1 illustrates the precise quantum cir-
the inner products can be efficiently evaluated             cuit used to evaluate the real part of hψ|φi.
by the well known Hadamard test. We propose               Here we list expressions at all five stages as
quantum circuits for the evaluation of the inner          shown in the figure:

                                                      4
                 1        
                                               ~
                                   H ⊗ I |0i |0i = √ |0i + |1i |~0i ,                                    (S1 )
                                                     2
                              1                   1                
                     C-Ubθ √ |0i + |1i |~0i = √ |0i |~0i + |1i |φi ,                                     (S2 )
                           φ
                                2                    2
                      1                         1                
                 X ⊗ I √ |0i |~0i + |1i |φi = √ |1i |~0i + |0i |φi ,                                     (S3 )
                          2                          2
                         1                        1                
                    bθ √ |1i |~0i + |0i |φi = √ |1i |ψi + |0i |φi ,
                  C-U                                                                                    (S4 )
                      ψ
                          2                          2
                      1                       1                           
                H ⊗I    √      |1i |ψi + |0i |φi = |0i |φi + |ψi + |1i |φi − |ψi ,                       (S5 )
                          2                        2

where H is the Hadamard gate, X is the Pauli-                        H                               H
                                     b is the
X gate, I is the identity mapping, C-U
           b
controlled U gate. Conducting the measure-
ment on the ancilla qubit, we have the prob-
ability measuring zero being,

 P(measurement = 0)
                                                          Figure 2: Extended inner product circuit.
                 1
               =    hφ| + hψ| |φi + |ψi
                 4
                 1 1                                      the real part in Eq. (3) corresponds to the imag-
               = + Re hψ|φi ,
                 2 2                                      inary part of the inner product instead. One
                                        (3)               way to implement the replacement is to add
                                                          a phase gate on the ancilla qubit after stage
where Re(·) denotes the real part of the com-
                                                          Eq. (S2 ). The resulting quantum circuit is as
plex number. Hence, if we execute the quan-
                                                          that in Figure 2 where the light blue dash box
tum circuit as in Figure 1 for M shots and
                                                          is enabled. We then need to execute the circuit
count the number of zeros, the Re hψ|φi could
                                                          for another M shots to obtain the approxima-
be well-approximated given M (which will scale
                                                          tion of Im hψ|φi.
as O( ǫ12 ) for some target accuracy ǫ due to sta-                                                    bθ |0i
                                                            In addition to variational ansatz |ψi = U
tistical sampling noise) is reasonably large.                                                           ψ

                                                          and |φi = U  bθ |0i, we could extend the quantum
                                                                         φ

                                                          circuit to variational ansatz |ψi = Ubθ Qb |0i and
                                                                                                 ψ
          H                                H
                                                          |φi = U  bθ Qb |0i, where Qb is a common quan-
                                                                     φ
                                                          tum circuit preparing the initial state. If such
                                                          ansatzes are adopted, we could enable the light
                                                          pink dash box in Figure 2 such that the circuit
                                                          Qb is applied on |0i as a common circuit. The
                                                          real and imaginary parts of the inner product
              S1      S2     S3       S4       S5         are evaluated similarly as before.
        Figure 1: Inner product circuit.
                                                          2.2       b
                                                                 hψ|H|φi evaluation
  The imaginary part of hψ|φi could be esti-
                                                          The Hamiltonian operator considered in this
mated in a similar way. Notice that the real
                                                          paper is under FCI framework and admits the
part in Eq. (3) comes from hψ|φi + hφ|ψi. If ei-
ther |ψi or |φi is replaced by ı |ψi or ı |φi, then

                                                      5
form,                                                                     ear algebra form,
                                                                                                              
           N
           X orb                    N
                                    X orb                                                                  b ,
                                                                                    min tr (2I − X ⋆ X)X ⋆ HX           (7)
    b=
    H                  a†pb
                   hpq b  aq +                    a†pb
                                             vpqrsb  a†q b ar , (4)
                                                         asb                      X∈CN×K

          p,q=1                  p,q,r,s=1
                                                                          where X is a multi-column vectors with each
where hpq , vpqrs are one-body and two-body in-                           column denoting a state vector, Hb is the Hamil-
tegrals respectively. The Hamiltonian operator                            tonian matrix, I is the identity matrix of size
 b in general, is not a unitary operator. Hence,
H,                                                                        K × K, and tr(·) denotes the trace operator.
it cannot be represented by a quantum circuit                             When H b is Hermitian negative definite, all lo-
directly. On the other hand, the excitation op-                           cal minima of Eq. (7) are of the form, 27
erators, b  aq and b
         a†pb       a†pb
                       a†q b ar , are unitary and can
                           asb
be represented by quantum circuits. The de-                                                   X = QV,                   (8)
tailed circuits of these excitation operators de-
                                                                          where Q ∈ CN ×K are K eigenvectors of H     b as-
pend on the encoding scheme. All methods pro-
posed in this paper do not rely on the encoding                           sociated with the smallest K eigenvalues, and
scheme. The value hψ|H|φi    b      is then calculated                    V ∈ RK×K is an arbitary unitary matrix. Sub-
as a weighted summation of circuit results,                               stituting Eq. (8) back into Eq. (7), we find that
                                                                          all these local minima are of the same objective
    D       E NXorb                                                       value and they are global minima with minimiz-
         b
        ψH φ =            a†pb
                    hpq ψ b  aq φ                                         ers consisting of mutually orthogonal vectors.
                      p,q=1                                               This property has been proven analytically. 27
                            N
                            X orb                                         Unlike many other optimization problems for
                      +                       a†pb
                                      vpqrs ψ b  a†q b
                                                     asb
                                                       ar φ .             solving the low-lying eigenvalue problem, OMM
                          p,q,r,s=1                                       as in Eq. (7) does not have any explicit orthog-
                                                               (5)        onal constraint and the orthogonality in Eq. (8)
                                                                          could be achieved without any orthogonaliza-
Thus it suffices to consider circuits for the eval-
                b |φi, where Ub denotes a general                         tion step. This property is the key reason for
uation of hψ|U
                                                                          us to investigate the application of Eq. (7) in
unitary operator.
                                             b |φi.                       quantum computing, where explicit orthogonal-
  Figure 3 illustrates the circuit for hψ|U
                                                                          ization is difficult to implement in the presence
This circuit can be understood in the same way
                                         b |φi can                        of hardware and environmental noise. Further-
as the circuit of hψ|φi. The value hψ|U
                                                                          more, despite the fact that minimizing Eq. (7) is
be explicitly written as,
                                                                          a non-convex problem, its property that all lo-
  D        E                                                          cal minimizers are also global minimizers which
    ψU b φ = h0| U    b†    UbU
                              bθ |0i = hψ|φi  e ,
                       θψ       φ                                         encode mutually orthogonal vectors in the case
                                       E        (6)                       where we are minimizing over all possible vec-
where the construction circuit for φe is U   bUbθ .
                                                 φ
                                                                          tors 27 (absent the constraint of vectors express-
Hence Figure 3 is essentially the circuit in Fig-                         able as quantum circuits) makes it a compelling
ure 1 and Figure 2 with a modified constructing                            objective function to investigate in the quantum
circuit for |φi.                                                          setting. The numerical simulations in this work
                                                                          serve to provide some insight into the extent to
                                                                          which these favorable properties in the classi-
3        Quantum Orbital Mini-                                            cal setting carry over to the quantum setting
                                                                          where we are constrained by vectors express-
         mization Method                                                  able as quantum circuit ansatzes.
Orbital minimization method originates in the                                In the quantum computing setting, each col-
field of density functional theory (DFT). The                              umn of X is represented by the ansatz |ψi i =
                                                                          bθ |φi i. By the nature of quantum computing,
                                                                          U
minimization problem admits the following lin-                                i
                                                                          if quantum error is not taken into account, |ψi i

                                                                      6
H                                                H

                          Figure 3: Extended inner product circuit with a unitary gate.

is always of unit length for all i. Hence the                            the value is not known in advance), then we
2I − X ⋆ X term in Eq. (7) has one on its diag-                          are confident that the ansatz is rich enough
onal. Substituting the variational ansatz into                           and the optimizer works perfectly in finding
Eq. (7), we obtain the optimization problem for                          the global minimum for parameters θ1 , . . . , θK .
our quantum excited state problem,                                       However, we lack the theoretical understand-
                                                                         ing of the energy landscape of Eq. (9). In part,
                    min g(θ1 , . . . , θK )                  (9)         the dependence of g on θ1 , . . . , θK , like that
                   θ1 ,...,θK
                                                                         in the deep neural network, is highly nonlin-
for                                                                      ear and nonconvex. It could be analyzed for
                                                                         some simple ansatz circuit. However, for most
    g(θ1 , . . . , θK )                                                  complicated circuits, the analysis remains diffi-
    XK D                      E                                          cult. On the other hand, from the linear alge-
  =         φi U  b† H bU
                        bθ φi                                            bra point of view, when the variational space is
                    θi    i
      i=1                                                                rich enough, we could view the problem as op-
          K D
          X                       ED             E                       timizing the vectors X directly in Eq. (7). The
      −              b† U
                  φi U  b              b† HbU
                                            bθ φi .
                      θi θj φ j     φj Uθj    i                          nature of quantum computer poses unit length
          i,j=1
           i6=j
                                                                         constraints on each column of X. Adding ex-
                                                         (10)            tra unit length constraints would dramatically
                                                                         change the energy landscape of the original
  If the variational space is rich enough such                           problem Eq. (7). We defer the theoretical un-
that vectors in QV as in Eq. (8) can be well-                            derstanding of Eq. (9) to future work. In this
                   
approximated by U    bθ |φi i K , then qOMM                              paper, we focus on the numerical performance
                       i      i=1
can achieve a good approximation to the global                           of Eq. (9).
minimum of its linear algebra counterpart                                  For numerical optimization of the objec-
Eq. (7). In practice, if we obtain the global                            tive function g, note that its gradient with
minimum value of Eq. (7) for qOMM (though                                respect to a parameter θiα is given by

                        *                +           *                +
               dg            b†
                            dU                   XK       b†
                                                         dU             D              E
                               θi b b                       θi b            b† HbU
                                                                                 bθ φi
                   =2 Re φi      H Uθj φj − 2 Re      φi       Uθj φj    φj Uθj    i
              dθiα          dθiα                 j=1
                                                         dθiα
                                                                       j6=i
                                  K D
                                                             *                      +                                  (11)
                                  X                      E              b
                         − 2 Re             b† U
                                         φi U  b                 b† Hb dUθi φi
                                             θi θj φ j        φj Uθj                    ,
                                  j=1
                                                                       dθiα
                                  j6=i

                                                                   7
that in MCVQE. We construct two matrices
where θiα is the α-th parameter in θi . From                with their (i, j)-th elements being,
Eq. (11), we notice that the gradient evalua-                                D                E
tion requires an efficient quantum circuit for                                      b † bb
                                                                     Aij = φi Uθi H Uθj φj and
 bθ
dU i
     . The existence of such quantum circuits                                       D            E  (12)
dθiα
                                                                             Bij = φi U b† Ubθ φj .
is ansatz dependent. Assume that we can eval-                                            θi   j

uate Eq. (11) for all parameters.Then the up-
                                                            Then a generalized eigenvalue problem with
date on parameters follows a gradient descent
                                                            matrix pencil (A, B) is solved as,
method as
                                                                               AR = BRΛ,                     (13)
            θ(t+1) = θ(t) − τ ∇θ g(θ(t) ),
                                                            where R denotes the eigenvectors and Λ is a di-
for θ(t+1) and θ(t) being all parameters at (t+1)-
                                                            agonal matrix with eigenvalues of (A, B) being
th and (t)-th iteration respectively, and τ is
                                                            its diagonal entries. The desired eigenvalues of
the stepsize. Since all evaluations of quantum               b are in Λ and the corresponding eigenvectors
                                                            H
circuits involve randomness, in practice, this                      P
                                                                           b
                                                            admits K   j=1 Uθj |φj i Rji for i = 1, . . . , K. The
amounts to using the stochastic gradient de-
                                                            eigenvector, which is the excited state of H,       b
scent method. Other advanced first order meth-
ods recently developed in deep neural network               can be explicitly constructed as a linear com-
literature, e.g., ADAM, AdaGrad, coordinate                 bination of unitary operators 31 on a quantum
descent method, etc., could be used to acceler-             computer.
ate. These methods could achieve better per-                  qOMM has several unique features com-
formance than the vanilla stochastic gradient               pared to existing excited state vector meth-
descent method.                                             ods. The targets of folded spectrum method 18
   When the gradient circuit is not available,              and witnessing eigenspectra solver 19,20 are dif-
we can adopt zeroth-order optimizers to solve               ferent from that of qOMM. Quantum deflation
Eq. (9). In this work, we use L-BFGS-B 28 as                method 21,22 addresses the excited states one-by-
the optimizer, which uses a 2-point finite differ-            one, and the optimization problems therein are
ence gradient approximation and thus does not               of different difficulty as that in qOMM. The
require explicit implementation of the circuit              most related excited state vector methods to
      b
     dU                                                     qOMM are SSVQE and MCVQE. Since SSVQE
for dθiα
       θi
          . We find this optimizer to work well
                                                            and MCVQE are very similar to each other,
in the noiseless simulations we consider here,
                                                            we focus on the difference among qOMM and
but we note that other promising candidates
                                                            SSVQE in this paper. Both the qOMM and
for noisy simulations and experimental calcula-
                                                            weighted SSVQE algorithms 23 take a set of in-
tions exist, which could also be used for our pro-
                                                            put states and seek to simultaneously optimize
posed objective function. In particular, using
                                                            the parameterized circuit and obtain the low-
simultaneous perturbation stochastic approxi-
                                                            lying eigenvalues and eigenvectors of the Her-
mation (SPSA) for small molecule VQE calcu-
                                                            mitian operator. Both methods achieve their
lations has been demonstrated experimentally
                                                            goal via solving a single optimization problem.
on superconducting hardware 29 and a quantum
                                                            There are two major differences between them.
natural gradient version of SPSA also exists. 30
                                                            The first one is how the mutual orthogonality of
  When the optimization problem is solved, we
                                                            the eigenvectors is enforced. The construction
will obtain a set of parameters θ1 , . . . , θK such
      bθ |φ1 i , . . . , U
                         bθ |φK i approximate vectors       of the weighted SSVQE objective function en-
that U  1                  K
                                                            sures that the set of states being evaluated is
in QV as in Eq. (8). Numerically, we find
                                                            mutually orthogonal throughout optimization
that V is very close to an identity matrix,
                                                            iteration. The proposed qOMM implicitly em-
while a post-processing procedure still improves
                                                            beds this constraint into the objective function
the accuracy of the approximation. The post-
                                                            such that the states being evaluated are only
processing procedure is slightly different from

                                                        8
guaranteed to be mutually orthogonal at the               noiseless simulations and the Parity mapping
global minimum. The first difference leads to               for the noisy simulations. The relative accu-
the second major difference. SSVQE adopts the              racies of the results for both methods are cal-
ansatz circuits for different states that must ad-         culated by comparing them to their numeri-
mit mutual orthogonal property. Hence, they               cally exact counterparts obtained by numeri-
restrict their ansatz circuits to be a set of iden-       cally exact diagonalization of the Hamiltoni-
tical parameterized circuits acting on mutually           ans. Circuits involved in noiseless qOMM sim-
orthogonal initial states. qOMM does not con-             ulations were conducted using Qiskit’s Stat-
strain the ansatz circuits in this way. qOMM              eVector simulator. Circuits involved in noise-
could adopt as its set of parameterized circuits          less SSVQE simulations were conducted using
any combination of ansatz circuits that have              Qiskit’s QasmSimulator in conjunction with
been tested in the literature, e.g., ansatz cir-          the AerPauliExpectation method for computing
cuits in MCVQE 24 and SSVQE, 23 ansatz cir-               expectation values in order to improve runtime
cuits in deflation method, 21,22 etc. In addi-             performance. Both of these methods produce
tion to these ansatz circuits, qOMM could also            ideal, noiseless results. Simulations for finding
adopt other ansatz circuits not been tested by            the energy levels of H2 were performed at an in-
excited state methods. In this work, we adopt             teratomic distance of 0.735 Angstroms, those of
UCCSD 32 as our main ansatz circuit so that the           LiH were performed at an interatomic distance
particle number is preserved throughout.                  of 1.595 Angstroms, and those of the hydro-
                                                          gen square lattice model were performed at an
                                                          interatomic distance of 1.23 Angstroms. Sim-
4    Numerical Results                                    ulation results for the hydrogen square model
                                                          and H2 at stretched bond distances can also be
We present the numerical results of our pro-
                                                          found in Appendix A. For the LiH simulations,
posed method obtained from a classically sim-
                                                          we freeze the two core electrons to reduce the
ulated quantum computer. We apply it to
                                                          problem size from 12 qubits to 10 qubits.
the problem of finding low-lying eigenvalues
                                                            The set of initial states for both algorithms
of the electronic structure Hamiltonian for
                                                          and all molecular Hamiltonians were chosen
near-equilibrium configurations of two differ-
                                                          to be the Hartree-Fock state and low-lying
ent molecules as well the toy model consisting
                                                          single-particle excitations above it. For ex-
of 4 hydrogen atoms arranged in a square lat-
                                                          ample, if we wish to find three energy lev-
tice. The numerical tests in this section serve
                                                          els for 4-qubit H2, in the Jordan-Wigner map-
to explore what advantages or disadvantages
                                                          ping this would be the set: {|01iα |01iβ ,
may arise for each of the qOMM and weighted
                                                          |01iα |10iβ , |10iα |01iβ }. For LiH, this would be
SSVQE approaches (using weight vectors of the
                                                          the set: {|00001iα |00001iβ , |00001iα |00010iβ ,
form [n, n − 1, ..., 1] for finding n states) in
                                                          |00010iα |00001iβ }, where the subscript α and
different situations. Section 4.1 presents the re-
                                                          β denote the spin group 2 . Given that the
sults for H2 and the hydrogen square model ob-
                                                          UCCSD 32 ansatz preserves the numbers of spin-
tained from noiseless simulations. Section 4.2
                                                          up and spin-down, the above choice of ini-
presents the results obtained from simulating
                                                          tial states constrains these variational quan-
H2 with a depolarizing noise model. Section 4.3
                                                          tum algorithms to search only the desired sub-
presents the results for LiH obtained from noise-
                                                          space, lessening the computational difficulty
less simulations.
  All codes for these simulations are imple-                 2
                                                              In the Qiskit implementation of n-qubit basis states
mented using the Qiskit 33 library. The elec-             in the Jordan-Wigner representation, the first n2 qubits
                                                          encode spin-up (α) and the second n2 qubits encode spin-
tronic structure Hamiltonians were generated
                                                          down (β). Thus we have chosen these three basis states
using molecular data from PySCF 34 in the                 because they are elements of the 2-particle, spin mag-
STO-3G basis and mapped to qubit Hamiltoni-               netization Sz = 0 subspace of the full 2n -dimensional
ans using the Jordan-Wigner mapping for the               Fock space.

                                                      9
Table 1: The success rate of given problem instances converging to within a relative accuracy of
10−5 for a given number of UCCSD repetitions using a randomized ansatz parameter initialization.

          Molecule                 Algorithm                            UCCSD Repetitions
                                                      1-rep     2-rep   3-rep    4-rep   5-rep   6-rep    7-rep
                               qOMM (3 states)        100%       −        −       −        −       −       −
        H2 (0.735 Å)
                               SSVQE (3 states)        0%       70%     100%      −        −       −       −
                               qOMM (3 states)        100%       −        −       −        −       −       −
        H2 (1.47 Å)
                               SSVQE (3 states)        0%       90%     100%      −        −       −       −
                               qOMM (2 states)            0%    100%      −   −            −       −       −
                               qOMM (3 states)            0%    100%      −   −            −       −       −
 Hydrogen square (1.23 Å)
                               SSVQE (2 states)            −     0%     100%  −            −       −       −
                               SSVQE (3 states)            −      −      0% 100%           −       −       −
                               qOMM (2 states)            0%     90%      −   −            −       −       −
                               qOMM (3 states)            0%    100%      −   −            −       −       −
 Hydrogen square (2.46 Å)
                               SSVQE (2 states)            −     0%     100%  −            −       −       −
                               SSVQE (3 states)            −      −      0% 100%           −       −       −
                               qOMM (2 states)            0%    100%      −        −   −    −    −
                               qOMM (3 states)            0%    100%      −        −   −    −    −
                               qOMM (4 states)             −    100%      −        −   −    −    −
                               qOMM (5 states)             −    100%      −        −   −    −    −
                               qOMM (6 states)             −    100%      −        −   −    −    −
                               qOMM (7 states)             −    100%      −        −   −    −    −
       LiH (1.595 Å)
                               SSVQE (2 states)           0%     80%      −        −   −    −    −
                               SSVQE (3 states)           0%     0%      70%       −   −    −    −
                               SSVQE (4 states)            −      −      0%      100%  −    −    −
                               SSVQE (5 states)            −      −       −      10% 100%   −    −
                               SSVQE (6 states)            −      −       −        −  10% 100%   −
                               SSVQE (7 states)            −      −       −        −   −   60% 100%

and producing physically meaningful results at            states after the initial ansatz circuit application
the same time. For this reason, UCCSD is our              remain the Hartree-Fock state and low-lying
ansatz of choice for these simulations. If we             single-particle excitations above the Hartree-
wish to find only two energy levels, we would              Fock states. For most molecules, these states
omit the highest-energy state from the set. This          from Hartree-Fock calculation are good approx-
choice of the set of states allows us to study two        imations of the ground state and low-lying ex-
different types of algorithm initializations: one          cited states under FCI. The optimization from
in which all of the ansatz parameters are ran-            the “zero vector” initialization could be viewed
domly sampled according to a uniform distribu-            as a local optimization and improves the perfor-
tion on [−2π, 2π) and another one in which they           mance of optimizers upon the random initial-
are all set to zero. With the random parame-              ization. In Qiskit’s implementation of ansatz
ter initialization, we can study the robustness           circuits such as UCCSD, one can increase the
of the algorithms with respect to their start-            expressiveness of the ansatz by repeating the
ing point in the parameter space. With the                corresponding quantum circuit block pattern r
“zero vector” initialization, the UCCSD ansatz            times, which comes at the cost of both the cir-
circuit is initialized to the identity and the            cuit depth and the number of variational pa-

                                                     10
rameters increasing by a factor of r. In this pa-         SSVQE gets stuck in a local minimum and does
per, we refer to an ansatz circuit that consists          not converge regardless of the ansatz circuit
of the UCCSD circuit block pattern repeated               depth used. In both figures and all later conver-
r times as r-repetition UCCSD (or simply r-               gence figures in this paper, we depict the con-
UCCSD). We run both algorithms for several                vergence curve as the relative error (defined as
different numbers of repetitions in order to ac-           |fexact −fmeasured |
                                                                |fexact |
                                                                               ) against the number of the ob-
count for the fact that the necessary ansatz ex-          jective function evaluations, which is considered
pressiveness for convergence is not known a pri-          the most expensive operation in VQE-type al-
ori. Such a study allows us to explore the de-            gorithms. The L-BFGS-B implementation used
pendence of convergence success on the circuit            in this paper uses a 2-point finite difference
depth for each algorithm.                                 method such that each parameter is perturbed
  When using a random initialization, we run              slightly from a given reference point common
each algorithm ten times for each setting. In             to all of them. Thus for an n-parameter ob-
Sections 4.1 and 4.3, for each setting we plot one        jective function, n + 1 function evaluations are
run roughly representative of the average con-            required for each parameter update step in or-
vergence. In the noisy results in Section 4.2, we         der to estimate the gradient. Hence, in all con-
plot the run which obtained the lowest objective          vergence figures, we observe stair-like curves of
function convergence. The complete set of all             width n + 1.
ten runs for all of these simulations are plotted            Figure 5 illustrates the convergence results for
in the appendix and their convergence success             the hydrogen square lattice at an interatomic
rates are summarized in Table 1 for complete-             distance of 1.23 Angstroms. We can see from
ness. When using the “zero vector” initializa-            these figures that similarly to H2, the difference
tion, we run each algorithm only once. Run-               between SSVQE and qOMM is primarily in the
ning the algorithm multiple times for this ini-           number of UCCSD repetitions needed to con-
tialization is neither necessary nor useful in the        verge. When calculating 2 states, qOMM re-
absence of noise because the outcome is deter-            quires 2-UCCSD whereas SSVQE requires 3-
ministic for a given initial point.                       UCCSD. When calculating 3 states, qOMM
                                                          still requires only 2-UCCSD, whereas SSVQE
4.1    H2 and Hydrogen square                             requires 4-UCCSD. These observations are fur-
                                                          ther tabulated in Table 1.
We begin by presenting the results for the 4
and 8 qubit systems we consider here: H2 and
the hydrogen square model. Figure 4 illustrates
                                                          4.2    Noisy H2 simulations
the convergence of the objective function for             We now run H2 simulations in the presence of
each algorithm when attempting to calculate               a classically simulated noise model on Qiskit’s
the first three energy levels of the H2 molecule           AerSimulator. Each one-qubit gate in the cir-
at the equilibrium bond distance.                         cuit is modelled as being accompanied by a lo-
  Figure 4b demonstrates how qOMM and                     cal depolarizing channel with some probability
SSVQE both converge to the global minimum,                of error perror . Two-qubit gates are accompa-
but notably, SSVQE cannot do so with just                 nied by a tensor product of two local depolar-
1-repetition UCCSD, requiring two repetitions             izing channels acting on each qubit. No error
in order to converge to within a relative accu-           mitigation strategies are employed. We use the
racy of 10−5 . The success rate is further re-            Parity mapping in order to use symmetry con-
ported in Table 1. Figure 4a illustrates an at-           siderations to reduce the H2 Hamiltonian to a 2
tempt to solve the same problem, except the               qubit representation. 35 We construct ansatzes
ansatz parameters are all initialized to zero.            for each algorithm using the circuit block pat-
In this instance, we see that qOMM converges              tern shown in Figure 6.
more quickly by a factor of roughly 5 compared
to its random initialization counterpart, while

                                                     11
objective function conve gence fo 3 states                                                           objective function conve gence fo 3 states
                                                                            qOMM (1-UCCSD)                          100                                                              qOMM (1-UCCSD)
                100                                                         SSVQE (3-UCCSD)                                                                                          SSVQE (3-UCCSD)
                                                                            SSVQE (2-UCCSD)                                                                                          SSVQE (2-UCCSD)
                                                                            SSVQE (1-UCCSD)                                                                                          SSVQE (1-UCCSD)
               10−1                                                                                                10−1

                                                                                                                   10−2
 elative e o

                                                                                                  elative e o
               10−2

               10−3                                                                                                10−3

               10−4                                                                                                10−4

               10−5                                                                                                10−5
                          0          10           20          30            40         50                                     0        50          100         150        200       250           300
                                      objective function evaluation ite ation                                                               objective function evaluation ite ation

                              (a) Zero vector initialization                                                                         (b) Random initialization
                                                                                                                              |fi −fexact |
Figure 4: Convergence (noise-free) of the relative error                                                                        |fexact |
                                                                                                                                               of qOMM and SSVQE for 3 H2
states at an interatomic distance of 0.735 Å.

                              objective function conve gence fo 2 states                                                             bjective functi n c nvergence f r 3 states
                100                                                         qOMM (2-UCCSD)                                                                                           qOMM (2-UCCSD)
                                                                            qOMM (1-UCCSD)                                                                                           qOMM (1-UCCSD)
                                                                            SSVQE (2-UCCSD)                                                                                          SSVQE (3-UCCSD)
                                                                            SSVQE (3-UCCSD)                                                                                          SSVQE (4-UCCSD)
               10−1                                                                                                10−1

               10−2                                                                                                10−2
                                                                                                  relative err r
 elative e o

               10−3                                                                                                10−3

               10−4                                                                                                10−4

               10−5                                                                                                10−5
                      0       5000        10000       15000       20000       25000      30000                            0          20000        40000           60000          80000           100000
                                      objective function evaluation ite ation                                                            objective functi   n evaluati n iterati n

                                           (a) Two states                                                                                      (b) Three states

Figure 5: Convergence (noise-free) of the relative error |fi|f−f exact |
                                                               exact |
                                                                         of qOMM and SSVQE for the
hydrogen square model at an interatomic distance of 1.23 Å using a random parameter initialization.

         Ry (θ0 )               Rz (θ2 )            •        Ry (θ4 )            Rz (θ6 )                 used to compile all circuits to the set  √ of ba-
                                                                                                          sis gets consisting of Rz , Ry , CNOT, X, and
         Ry (θ1 )               Rz (θ3 )                     Ry (θ5 )            Rz (θ7 )                 the identity. For all problem instances, we use
                                                                                                          the minimum number of repetitions necessary
Figure 6: Base circuit block pattern used for                                                             to converge in the absence of noise in order to
noisy H2 simulations.                                                                                     ensure that we are not simply measuring the
                                                                                                          ability of the ansatz to represent the global min-
  This pattern can be repeated an arbitrary                                                               imum. This corresponds to one repetition for
number of times to construct increasingly ex-                                                             qOMM and two repetitions for SSVQE. We use
pressive ansatzes in the same way that the num-                                                           COBYLA as the classical optimization subrou-
ber of UCCSD repetitions was varied in the                                                                tine. This optimizer is more suitable for noisy
noiseless simulations. The Qiskit compiler is                                                             simulations than L-BFGS-B due to its lack of

                                                                                                 12
objective function convergence for 2 states                                                  objective function convergence for 3 states
                                                                                 qOMM                                                                                       qOMM
                                                                                 SSVQE                                                                                      SSVQE

                  10−1                                                                                         10−1
 relative error

                                                                                              relative error
                                                                                                               10−2
                  10−2

                         0        50            100           150          200                                        0    50       100        150       200        250   300
                                   objective function evaluation iteration                                                      objective function evaluation iteration

                                       (a) Two states                                                                              (b) Three states

Figure 7: Convergence of the relative error |fi|f−f exact |
                                                  exact |
                                                            of qOMM and SSVQE for H2 at an interatomic
distance of 0.735 Å, where each circuit gate is modelled as having a probability of local depolarizing
error perror = 0.001.

a need to calculate gradient information. 106                                                        of 10−5 much more quickly than their randomly
circuit samples are used to evaluate all inner                                                       initialized counterparts. Notably, qOMM re-
product terms and expectation values. perror is                                                      quires only 1-repetition UCCSD to achieve this
set to 10−3 . Both SSVQE and qOMM are run                                                            convergence, while SSVQE requires two repeti-
ten times with randomly initialized parameters.                                                      tions. When three states are calculated, qOMM
The convergence of the runs which achieved                                                           can achieve a convergence below 10−5 with 2-
the lowest objective function values are given                                                       repetition UCCSD, whereas SSVQE requires
in Figure 7. The results for all ten runs are                                                        three repetitions. From Figure 8b we can see
given in Appendix A. We can see from these                                                           that when the ansatz parameters are all initial-
figures that both SSVQE and qOMM demon-                                                               ized to zero, both algorithms quickly converge
strate a robustness to noise, although they do                                                       to the global minimum. Notably, qOMM re-
not achieve the same accuracy as the noiseless                                                       quires only 1-UCCSD repetition with this ini-
results.                                                                                             tialization. When attempting to find larger
                                                                                                     numbers of states, we can see from Table 1
4.3                      LiH                                                                         that qOMM can converge with 2-UCCSD for all
                                                                                                     numbers of states considered, whereas SSVQE
We now present the results for the 10-qubit                                                          requires an increasing number of ansatz repeti-
system we consider here: LiH. Figure 8 illus-                                                        tions to find increasing numbers of states. In
trates the convergence of the objective func-                                                        the following, we discuss the numerical results
tion for both algorithms with various UCCSD                                                          on LiH mainly from three perspectives: conver-
ansatzes when calculating up to the first three                                                       gence plateau, initialization, and ansatz circuit
energy levels. We can see from Figure 8c that                                                        depth.
when the ansatz parameters are randomly ini-                                                           Convergence Plateau. In many of the LiH
tialized, both algorithms demonstrate the abil-                                                      convergence plots we show here, both qOMM
ity to converge within an accuracy of 10−5 , but                                                     and SSVQE initially converge at a rapid rate
require 2-repetition UCCSD to do so. From                                                            but then hit a plateau and stall the conver-
Figure 8a we can see that when all of the                                                            gence until they escape from the “bad” region
parameters are initialized to zero, both algo-                                                       of the energy landscape. As long as the algo-
rithms converge to within a relative accuracy                                                        rithm escapes from the “bad” region, the con-

                                                                                         13
objective function convergence for 2 states                                                         objective function convergence for 3 states
                   100                                                          OMM (1-UCCSD)                         100                                                    OMM (1-UCCSD)
                                                                               SSVQE (2-UCCSD)                                                                              SSVQE (3-UCCSD)
                                                                               SSVQE (1-UCCSD)                                                                              SSVQE (2-UCCSD)
                                                                                                                                                                            SSVQE (1-UCCSD)
                  10−1                                                                                               10−1

                  10−2                                                                                               10−2
 relative error

                                                                                                    relative error
                  10−3                                                                                               10−3

                  10−4                                                                                               10−4

                  10−5                                                                                               10−5
                         0     1000       2000       3000       4000       5000             6000                             0   2500     5000 7500 10000 12500 15000 17500
                                   objective function evaluation iteration                                                              objective function evaluation iteration

                     (a) Two states (zero vector initialization)                                                       (b) Three states (zero vector initialization)

                              bjective functi n c nvergence f r 2 states                                                         objective function conve gence fo 3 states
                   100                                                         qOMM (2-UCCSD)                         100                                                   qOMM (2-UCCSD)
                                                                               qOMM (1-UCCSD)                                                                               qOMM (1-UCCSD)
                                                                               SSVQE (2-UCCSD)                                                                              SSVQE (3-UCCSD)
                                                                               SSVQE (1-UCCSD)                                                                              SSVQE (2-UCCSD)
                  10−1                                                                                               10−1
 relative err r

                  10−2                                                                                               10−2
                                                                                                    elative e o

                  10−3                                                                                               10−3

                  10−4                                                                                               10−4

                  10−5                                                                                               10−5
                         0        20000           40000            60000              80000                                  0      50000        100000       150000          200000
                                   objective functi n evaluati   n iterati n                                                          objective function evaluation ite ation

                         (c) Two states (random initialization)                                                             (d) Three states (random initialization)
                                                                                                         |fi −fexact |
                         Figure 8: Convergence of the relative error                                       |fexact |
                                                                                                                                 of qOMM and SSVQE for LiH.

vergence rate resumes being rapid. These “bad”                                                              by itself, is non-convex but has no spurious lo-
regions of the energy landscape (regardless of                                                              cal minima. 27,36,37 When the parameterization
what their nature may be) seem to be the main                                                               of the ansatz circuits is taken into considera-
obstacle for the convergence of both algorithms,                                                            tion, the energy landscape of qOMM is non-
which is more severe to SSVQE. Studying this                                                                convex and could have spurious local minima.
apparent feature of the energy landscape may                                                                For non-convex energy landscapes, usual opti-
provide some insight as to how both algorithms                                                              mizers, including L-BFGS-B, are efficient in a
may be improved and what limitations are in-                                                                neighborhood of local minima whereas, around
herent to the construction of the cost functions.                                                           strict saddle points, without second-order in-
From a numerical linear algebra point of view,                                                              formation, the optimizers could stall there for
such a convergence behavior is not very sur-                                                                a long time. From a practical point of view,
prising. The objective function in SSVQE is                                                                 we can further investigate the convergence of
convex, while with the consideration of the or-                                                             each eigenvalue. Since the issue is more se-
thogonality constraint and the parameteriza-                                                                vere for SSVQE, we focus on the SSVQE rather
tion of the ansatz circuits, the energy land-                                                               than qOMM. The construction of the SSVQE
scape of SSVQE becomes non-convex. On the                                                                   objective function allows us to obtain the esti-
other hand, the objective function of qOMM,                                                                 mated values for each eigenvalue at every func-

                                                                                                   14
SSVQE eigenval e convergence for 3 states                                                                  SSVQE eigenvalue convergence for 3 states
                     −0.80                                                                                                        100
                                                                           exact val es                                                                                                      E0
                                                                           E0                                                                                                                E1
                                                                           E1                                                                                                                E2
                     −0.85                                                 E2
                                                                                                                                 10−2

                                                                                               energy levels relative accuracy
                     −0.90
energy levels (Ha)

                                                                                                                                 10−4
                     −0.95

                     −1.00                                                                                                       10−6

                     −1.05
                                                                                                                                 10−8

                     −1.10
                             0   50000 100000 150000 200000 250000 300000 350000 400000                                                 0   50000 100000 150000 200000 250000 300000 350000 400000
                                        objective f nction eval ation iteration                                                                    objective function evaluation iteration

Figure 9: Convergence of the first three energy                                                 Figure 10: Relative accuracy n,i   n,exact
                                                                                                                                          of the
                                                                                                                                                                           |E    −E          |
                                                                                                                             |En,exact |
levels of LiH for SSVQE using a random param-                                                  convergence of the first three energy levels of
eter initialization.                                                                           LiH for SSVQE using a random parameter ini-
                                                                                               tialization.
tion evaluation (in addition to the objective
function as a whole) at a negligible additional                                                rect impact on all the expectation value terms
computational cost. This allows us to gain some                                                simultaneously. Second, the energy levels con-
intuition as to why the SSVQE objective func-                                                  tribute unequally to the overall cost function,
tion convergence is observed to plateau in many                                                with the lower energy levels contributing more
of the LiH runs before steeply converging below                                                than the higher energy levels, an effect that be-
a relative accuracy of 10−5 . In Figure 9 and Fig-                                             comes more pronounced due to the presence of
ure 10, we plot the eigenvalues and their rela-                                                the weight vector. Because qOMM enforces or-
tive accuracies, respectively, for one of the ten                                              thogonality (at the global minimum) through
randomly-initialized SSVQE runs. We can see                                                    its overlap terms, the energy expectation value
from Figure 9 that while E0 converges rapidly,                                                 terms are associated with different independent
the convergence of E1 and E2 plateau and do                                                    parameter values. This means that changes in
not converge for many more function evalua-                                                    parameters that directly affect one of the energy
tions. From Figure 10 we can see that E1 does                                                  expectation value terms in Eq. (1) do not di-
not escape from its plateau until the accuracy of                                              rectly change the others. They only change the
E0 is sufficiently decreased. This is followed by                                                overlap terms. This affords qOMM a flexibility
the accuracies of E0 and E1 increasing together.                                               that allows it to spend much fewer iterations es-
E2 does not escape from its plateau until the ac-                                              caping from the “bad” landscape regions. From
curacies of E0 and E1 are both sufficiently de-                                                  Figure 8b and Figure 8a we see that starting
creased, after which the accuracies of all three                                               with a good initial guess can reduce this effect,
energy levels collectively increase, allowing the                                              resulting in significantly improved convergence.
objective function to converge to its global min-                                                We also briefly note that although we have
imum. While we only illustrate this for one                                                    chosen to measure convergence speed in terms
of the ten runs, we observe that this qualita-                                                 of the number of calls to the objective function,
tive behavior is typical for the other nine runs.                                              another valid measure of convergence speed
Besides the non-convexity reason from objec-                                                   would be the total number of optimization it-
tive functions, other reasons for this behavior                                                erations if it can be assumed that one can par-
could be related to two aspects of the weighted                                                allelize multiple calls to the objective function.
sum nature of the SSVQE algorithm. First, the                                                  This is because one optimization iteration can
same ansatz circuit is applied to all input states,                                            require multiple calls to the objective function.
meaning any change in the parameters has a di-                                                 In the noiseless results we have presented here,

                                                                                          15
each optimization iteration of the L-BFGS-B                                                        starting from random parameter initializations
optimizer requires n+1 calls to the objective                                                      sometimes cannot find the global minima, as
function for n total parameters. For problem                                                       can be seen from Table 1. Overall, in all re-
instances we have presented here where qOMM                                                        sults included in this paper, zero vector param-
has more parameters than SSVQE, the speedup                                                        eter initializations outperform their random
of qOMM over SSVQE would be even greater                                                           parameter initialization counterparts for both
by this measure. For instance, in Figure 8d,                                                       algorithms. However, we still include numeri-
qOMM with 2-UCCSD has 144 total parame-                                                            cal results for random parameter initializations
ters and SSVQE with 3-UCCSD has 72.                                                                to demonstrate the different performance be-
                                                                                                   tween qOMM and SSVQE. This is because the
                                       VQE convergence
                                                                                                   zero vector initializations initialize the states
                                                           VQE (zero vector initialization)        to be excitations from Hartree-Fock states and
                   100                                     VQE (random initialization)
                                                                                                   only work for a limited number of low-lying
                  10−1
                                                                                                   states. If more excited states are needed, the
                                                                                                   zero vector initialization could even be a bad
 relative error

                  10−2                                                                             initialization, which is due to the fact that
                                                                                                   the excitation energies from Hartree-Fock may
                  10−3                                                                             not be in the consistent order with that of
                                                                                                   FCI excitation energies. This "zero vector"
                  10−4
                                                                                                   initialization is demonstrated primarily not to
                  10−5
                                                                                                   advocate for its use as a scalable initialization
                         100   101            102             103
                               objective function evaluation iteration
                                                                                104
                                                                                                   strategy, but to illustrate what the improved
                                                                                                   convergence could look like if it can be assumed
Figure 11: Convergence of the relative error                                                       that one has a reasonable initialization strat-
|E0,i −E0,exact |
   |E0,exact |
                  of VQE for the ground state prob-                                                egy. The flexibility of variational ansatzes in
lem of LiH using a random parameter initial-                                                       qOMM would have another advantage when
ization                                                                                            many excited states are needed. In qOMM,
                                                                                                   we could use zero vector parameter initializa-
  Initialization. Initialization plays an impor-                                                   tion for a few states and random parameter
tant role in the convergence for both qOMM                                                         initialization for the remaining states without
and SSVQE. Even for the corresponding                                                              known good initializations. Such a mixed ini-
VQE ground state problem with 1-repetition                                                         tialization strategy could avoid being trapped
UCCSD, as in Figure 11, random parameter                                                           in bad local minima and benefit from the fast
initialization on average converges in about                                                       convergence of those states with zero vector pa-
100 times more objective function evaluations                                                      rameter initialization. SSVQE, unfortunately,
than the zero vector parameter initialization.                                                     cannot use such a mixed initialization strat-
Both initializations in Figure 11 converge to a                                                    egy due to the implicit orthogonality among
relative error 10−5 . In excited state comput-                                                     initialized states. This motivates further in-
ing, the two different initialization strategies                                                    vestigation into developing and benchmarking
for both qOMM and SSVQE not only differ in                                                          initialization strategies more sophisticated than
the number of iterations but also differ in the                                                     the ones we consider in this work. For exam-
convergence results. Comparing Figures 8a and                                                      ple, the MCVQE paper 24 proposes an efficient
8c, for SSVQE with 2-repetition UCCSD, ran-                                                        quantum circuit implementation of CIS states
dom parameter initialization converges more                                                        for excited state initializations. This initial-
than ten times slower than that of zero vector                                                     ization would likely improve the convergence
parameter initialization. A similar result can                                                     results of the simulations for both qOMM and
be observed in Figures 8b and 8d for SSVQE                                                         SSVQE presented in this paper. Furthermore,
with 3-repetition UCCSD. Besides the differ-                                                        because the optimization stage of MCVQE can
ence in the number of iterations, optimization                                                     be seen as a special case of SSVQE that uses a

                                                                                              16
CIS initialization and an equal weighting of the          impact of ansatz circuit depth relates to opti-
states, performing analogous simulations using            mization. Another field full of non-convex op-
a CIS initialization would offer insight into how          timization and parametrized ansatz is the deep
the convergence of qOMM compares to that of               neural network. An interesting theoretical re-
MCVQE. The approach of using chemically-                  sult 38 therein shows that enlarging the param-
motivated initializations inspired from classical         eter space, i.e., increasing the number of pa-
computational chemistry is likely to further              rameters, would make the optimization prob-
improve the convergence of algorithms such as             lem easier, i.e., the energy landscape would be
qOMM and SSVQE. The main challenge in this                full of the approximated global minimum. Al-
approach is that high level chemically-inspired           though we have not established similar results
initializations require efficient low-level circuit         for quantum ansatz circuits, we observe the be-
implementation, which is highly non-trivial.              havior from our numerical results. In Figure 8b,
  Ansatz Circuit Depth. Based on our nu-                  1-repetition UCCSD is able to characterize the
merical results, increasing the ansatz circuit            three states for LiH using qOMM. However, all
depth, i.e., increasing the repetition in UCCSD,          ten random parameter initializations got stuck
has two major impacts. First, the expressive-             as can be seen in Table 1. As we increase the
ness increases as the ansatz circuit depth in-            circuit depth from 1-repetition to 2-repetition
creases. In the ground state VQE, as in Fig-              UCCSD, qOMM rapidly converges to a global
ure 11, 1-repetition UCCSD is sufficient for                minimum for all ten random parameter initial-
the ansatz to approximate the ground state to             izations.
chemical accuracy. When two states of LiH                   We conclude this section by discussing the
are needed, 1-repetition UCCSD, as in Fig-                circuit depth involved in calculating the inner
ure 8, is not able to simultaneously approx-              product terms in the qOMM objective function.
imate the ground state and the first excited               Currently, the only method we are aware of that
states, whereas 2-repetition UCCSD is able to.            involves running the circuit in Figure 1. At
From the qOMM results in Figure 8a, we know               a minimum, the inner product circuit will in-
that 1-repetition UCCSD is able to approxi-               cur a circuit depth twice that of the original
mate the ground state and the first excited                ansatz. There will be additional circuit depth
state. When three states are needed, as in Fig-           due to the need to compile controlled versions
ure 8b, we need 3-repetition UCCSD to simul-              of the ansatz to the finite basis gate set of the
taneously approximate three states. Again, 1-             machine. The extent of this additional depth
repetition UCCSD is still able to approximate             compared to the original ansatz will depend on
any of these three states. More generally, 2-             a number of factors including the basis gate
UCCSD is sufficient for all LiH cases studied               set and qubit connectivity of the hardware, the
when using qOMM, whereas SSVQE requires                   choice of ansatz, and the efficiency of the com-
n-UCCSD in order to reliably find n states.                pilation algorithm used. We can give some in-
The results for the hydrogen square model in              tuition for the circuit depths involved for the
Figure 5 demonstrates a similar pattern, ex-              particular systems we studied here for a partic-
cept the situation is more severe for SSVQE,              ular basis gate set. Using the Qiskit compiler,
which requires more than n-UCCSD for finding               the gate depth for the 1-UCCSD ansatz used
n states. A more detailed analytical descrip-             in our LiH simulations is approximately 1400
tion of the ability of the UCCSD ansatz to si-            gates when compiled
                                                                         √        to the basis gate set con-
multaneously represent an increasing number of            sisting of Rz , X, X, CNOT, and the identity.
excited states as the number of repetitions is in-        The corresponding circuit depth for computing
creased is not yet known. qOMM could adopt                the inner product terms is approximately 19600
many UCCSDs, UCCSD with many repetitions,                 gates. These counts would be scaled roughly by
or a mixture of them as its variational ansatz.           a factor of n for n-UCCSD. This underscores
SSVQE, however, could only benefit from in-                the importance of future work that addresses
creasing the repetitions in UCCSD. The second             how to reduce these gate counts. In partic-

                                                     17
You can also read