A Simple and Fast Coordinate-Descent Augmented-Lagrangian Solver for Model Predictive Control

Liang Wu¹, Alberto Bemporad¹

¹The authors are with the IMT School for Advanced Studies Lucca, Italy, {liang.wu,alberto.bemporad}@imtlucca.it

arXiv:2109.10205v1 [math.OC] 21 Sep 2021

Abstract— This paper proposes a novel Coordinate-Descent Augmented-Lagrangian (CDAL) solver for linear, possibly parameter-varying, model predictive control problems. At each iteration, an augmented Lagrangian (AL) subproblem is solved by coordinate descent (CD), whose computation cost depends linearly on the prediction horizon and quadratically on the state and input dimensions. CDAL is simple to implement and does not require explicitly constructing the matrices of the quadratic programming problem to solve. To favor convergence speed, CDAL employs a reverse cyclic rule for the CD method, Nesterov's accelerated scheme for updating the dual variables, and a simple diagonal preconditioner. We show that CDAL competes with other state-of-the-art first-order methods, both for unstable linear time-invariant prediction models and for prediction models linearized at runtime. All numerical results are obtained from a very compact, library-free, C implementation of the proposed CDAL solver.

Index Terms— Augmented Lagrangian method, coordinate descent method, model predictive control

I. INTRODUCTION

Model predictive control (MPC) has been widely used for decades to control multivariable systems subject to input and output constraints [1]. Apart from small-scale linear time-invariant (LTI) MPC problems whose explicit MPC control law can be obtained [2], deploying an MPC controller in an electronic control unit requires an embedded Quadratic Programming (QP) solver. In the past decades, the MPC community has made tremendous research efforts to develop embedded QP algorithms [3], based on interior-point methods [4], [5], active-set algorithms [6], [7], gradient projection methods [8], the alternating direction method of multipliers (ADMM) [9], [10], and other techniques [11]–[14].

A demanding requirement for industrial MPC applications is code simplicity, especially in safety-critical applications in which the code must be verified and validated and must be easy to maintain. In this respect, compared to interior-point and active-set methods that may require linear algebra operations, first-order methods like gradient projection and ADMM are quite appealing, due to their very simple embedded implementation. Most of the proposed approaches, however, require that the matrices of the QP problem are constructed explicitly (for instance by condensing the MPC problem through the elimination of equality constraints) in order to be consumed by the solver, typically for preconditioning, to estimate the Lipschitz constant of the cost gradient, for matrix factorizations, and during solver iterations. While this may not be an issue for LTI-MPC problems, in which the QP problem construction and other operations on the problem matrices can be done off-line, for linear parameter-varying (LPV) MPC problems, or for LTI problems in which the cost function and/or constraint matrices change at run time, constructing the QP problem explicitly online increases the complexity of the embedded code. To avoid such a QP construction phase, in [15] the authors proposed a construction-free solver for linear MPC based on an active-set method, whose implementation however requires more complex operations than those involved in first-order algorithms.

A. Contribution

This paper describes a novel construction-free MPC solver for both LTI and LPV MPC problems that is very simple to code and computationally efficient. The solver is based on combining coordinate descent (CD) and augmented Lagrangian (AL) methods.

Coordinate descent has received extensive attention in recent years due to its application to machine learning [16]–[18] and other applications [19]. Random and accelerated variants of the CD method were proposed in [20]–[23]. In this paper, we will exploit the special structure arising from linear MPC formulations when applying CD.

In [24]–[26], the authors also use AL to solve linear MPC problems with input and state constraints, using the fast gradient method [27] to solve the associated subproblems. The Lipschitz constant of the cost gradient and convexity parameters [24] are needed to achieve convergence, and computing them requires in turn the Hessian matrix of the subproblem and hence constructing the QP problem. As the Hessian matrix of the AL subproblem is close to a block diagonal matrix, this suggests the use of the CD method to solve such a QP subproblem, due to the fact that CD does not require any problem-related parameter. Moreover, only small matrices are involved in running the CD method, namely the matrices of the linear prediction model and the weight matrices. As a result, the proposed CDAL algorithm does not require the QP construction phase and is extremely simple to implement. In addition, each update of the optimization vector has a computation cost per iteration that is quadratic in the state and input dimensions and linear in the prediction horizon, which makes the proposed algorithm attractive for long prediction horizons.

To improve the convergence speed of CDAL, three techniques are proposed in this paper: a reverse cyclic rule for CD, Nesterov's acceleration [27], and preconditioning. While the use of a reverse cyclic rule in CD still preserves convergence, when the MPC problem is solved by warm-starting it from the shifted previous optimal solution, the gap between the initial guess and the new optimal solution is mainly caused by the last block of variables, and computing the last block at the beginning tends to reduce the overall number of required iterations to converge, as we will verify in the numerical experiments reported in this paper. Since the general AL method has $O(1/k)$ convergence rate [28], we employ Nesterov's acceleration scheme for updating the dual vector to improve computation speed, although the accelerated scheme does not reach the expected $O(1/k^2)$ convergence rate due to the limit imposed on the number of iterations of the inner CD loop. To further reduce the outer loop iterations, this paper proposes a heuristic preconditioner that simply scales the state variables.

The structure of the paper is as follows. Section II formulates the MPC problem we want to solve. In Section III, we present the augmented Lagrangian and coordinate descent methods we use to formulate the solver, together with the reverse cyclic rule, Nesterov's acceleration scheme, and preconditioner. In Section IV, two numerical benchmark examples are presented, namely the ill-conditioned AFTI-16 linear MPC example, and the continuously stirred tank reactor (CSTR) MPC example, in which the nonlinear dynamics is linearized at each controller execution. Finally, we draw conclusions in Section V.

B. Notation

$H \succ 0$ ($H \succeq 0$) denotes positive definiteness (semi-definiteness) of a square matrix $H$, $H'$ (or $z'$) denotes the transpose of matrix $H$ (or vector $z$), $H_{i,j}$ denotes the element in the $i$th row and $j$th column of matrix $H$, and $H_{i,\cdot}$, $H_{\cdot,j}$ denote the $i$th row vector and the $j$th column vector of matrix $H$, respectively. For a vector $z$, $\|z\|_2$ denotes the Euclidean norm of $z$, and $z_{\neq i}$ the subvector obtained from $z$ by eliminating its $i$th component $z_i$.

II. MODEL PREDICTIVE CONTROL

Consider the following MPC formulation for tracking problems

\[
\begin{aligned}
\min \quad & \sum_{t=0}^{T-1} \left\| W_y (y_{t+1} - r_{t+1}) \right\|_2^2 + \left\| W_u \left( u_{t+1} - u^r_{t+1} \right) \right\|_2^2 + \left\| W_{\Delta u} \Delta u_t \right\|_2^2 \\
\text{s.t.} \quad & x_{t+1} = A x_t + B u_{t-1} + B \Delta u_t \\
& y_t = C x_t \\
& u_t = u_{t-1} + \Delta u_t \\
& x_{\min} \le x_t \le x_{\max}, \quad t = 1, \dots, T \\
& u_{\min} \le u_t \le u_{\max}, \quad t = 0, \dots, T-1 \\
& \Delta u_{\min} \le \Delta u_t \le \Delta u_{\max}, \quad t = 0, \dots, T-1
\end{aligned}
\tag{1}
\]

in which $r_t$ and $u^r_t$ are the output and input set-points, $x_t \in \mathbb{R}^{n_x}$ the state vector, $u_t \in \mathbb{R}^{n_u}$ the input vector, $\Delta u_t = u_t - u_{t-1}$ the vector of input increments, and $y_t \in \mathbb{R}^{n_y}$ the output vector. We assume that $W_y = W_y' \succ 0$, $W_u = W_u' \succ 0$, $W_{\Delta u} = W_{\Delta u}' \succ 0$. The formulation (1) could be extended to include time-varying bounds on $x$ and $u$ along the prediction horizon, linear equality constraints or box constraints on the terminal state $x_T$ for guaranteed closed-loop convergence, as well as affine prediction models. To simplify the notation, in the sequel we consider the following reformulation of (1)

\[
\begin{aligned}
\min \quad & \sum_{t=1}^{T} \hat{x}_t' (\hat{C}' \hat{W} \hat{C}) \hat{x}_t - \hat{x}_t' (\hat{C}' \hat{W} \hat{r}_t) + \hat{u}_{t-1}' W_{\Delta u} \hat{u}_{t-1} \\
\text{s.t.} \quad & \hat{x}_{t+1} = \hat{A} \hat{x}_t + \hat{B} \hat{u}_t \\
& \hat{x}_{\min} \le \hat{x}_t \le \hat{x}_{\max}, \quad t = 1, \dots, T \\
& \hat{u}_{\min} \le \hat{u}_t \le \hat{u}_{\max}, \quad t = 0, \dots, T-1
\end{aligned}
\tag{2}
\]

where $\hat{x}_j = \begin{bmatrix} x_j \\ u_{j-1} \end{bmatrix}$, $\hat{u}_j = \Delta u_j$, $\hat{A} = \begin{bmatrix} A & B \\ 0 & I \end{bmatrix} \in \mathbb{R}^{n \times n}$, $\hat{B} = \begin{bmatrix} B \\ I \end{bmatrix} \in \mathbb{R}^{n \times p}$, $\hat{C} = \begin{bmatrix} C & 0 \\ 0 & I \end{bmatrix}$, $\hat{W} = \begin{bmatrix} W_y & 0 \\ 0 & W_u \end{bmatrix}$, $\hat{r}_j = \begin{bmatrix} r_j \\ u^r_j \end{bmatrix}$. The vector $z$ of variables to optimize is

\[
z = \begin{bmatrix} \hat{u}_0' & \hat{x}_1' & \hat{u}_1' & \dots & \hat{u}_{T-1}' & \hat{x}_T' \end{bmatrix}' \in \mathbb{R}^{T(n_x + n_u)}
\]

The inequality constraints on state and input variables, whose number is $2T(n_x + n_u)$, are

\[
\underline{z} \le z \le \bar{z} \;\Leftrightarrow\;
\begin{cases}
\hat{x}_{\min} \le \hat{x}_t \le \hat{x}_{\max}, & \forall t = 1, \dots, T \\
\hat{u}_{\min} \le \hat{u}_t \le \hat{u}_{\max}, & \forall t = 0, \dots, T-1
\end{cases}
\]

At each sample step, the MPC problem (1) can be recast as the following quadratic program (QP)

\[
\begin{aligned}
\min \quad & \tfrac{1}{2} z' H z + h' z \\
\text{s.t.} \quad & \underline{z} \le z \le \bar{z} \\
& G z = g
\end{aligned}
\tag{3}
\]

where $H = H' \succeq 0$, $H \in \mathbb{R}^{n_z \times n_z}$, $n_z = T(n_x + n_u)$, $h \in \mathbb{R}^{n_z}$, $G \in \mathbb{R}^{T n_x \times n_z}$, and $g \in \mathbb{R}^{T n_x}$ are defined as

\[
H = \begin{bmatrix}
R & 0 & \cdots & 0 & 0 \\
0 & Q & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & R & 0 \\
0 & 0 & \cdots & 0 & Q
\end{bmatrix},
\qquad R = W_{\Delta u}, \qquad Q = \hat{C}' \hat{W} \hat{C}
\]

\[
G = \begin{bmatrix}
\hat{B} & -I & 0 & 0 & \cdots & 0 & 0 & 0 \\
0 & \hat{A} & \hat{B} & -I & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & \hat{A} & \hat{B} & -I
\end{bmatrix}
\]

\[
h = \begin{bmatrix} \hat{C}' \hat{W} \hat{r} \\ \hat{C}' \hat{W} \hat{r} \\ \vdots \\ \hat{C}' \hat{W} \hat{r} \end{bmatrix},
\qquad
g = \begin{bmatrix} -A \hat{x}_k \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\]

Clearly matrix $G$ is full row rank. Note that $A$, $B$, $C$, $W_y$, $W_u$, $W_{\Delta u}$ and the upper and lower bounds on $x$, $u$, and $\Delta u$ in (1) may change at each controller execution.

III. ALGORITHM

A. Augmented Lagrangian Method

We solve the convex quadratic programming problem (3) by applying the augmented Lagrangian method. The bound-constrained Lagrangian function $L : Z \times \mathbb{R}^{n_z} \to \mathbb{R}$ is given by

\[
L(z, \lambda) = \tfrac{1}{2} z' H z + z' h + \lambda' (G z - g)
\]
where $Z = \{z : \underline{z} \le z \le \bar{z}\}$ and $\lambda \in \mathbb{R}^{T n_x}$ is the vector of Lagrange multipliers associated with the equality constraints in (3). The dual problem of (3) is

\[
\max_{\lambda \in \mathbb{R}^{T n_x}} d(\lambda)
\tag{4}
\]

where $d(\lambda) = \min_{z \in Z} L(z, \lambda)$. Assuming that Slater's constraint qualification holds, the optimal values of the primal problem (3) and of its dual (4) coincide. However, $d(\lambda)$ is not differentiable in general [29], so that any subgradient method for solving (4) would have a slow convergence rate. Under the AL framework, the augmented Lagrangian function

\[
L_\rho(z, \lambda) = \tfrac{1}{2} z' H z + z' h + \lambda' (G z - g) + \tfrac{\rho}{2} \| G z - g \|^2
\tag{5}
\]

is used instead, where $\rho > 0$ is a penalty parameter. The corresponding augmented dual problem is defined as:

\[
\max_{\lambda \in \mathbb{R}^{m}} d_\rho(\lambda)
\tag{6}
\]

where $d_\rho(\lambda) = \min_{z \in Z} L_\rho(z, \lambda)$. The dual problem (4) and the augmented dual problem (6) share the same optimal solution [30], and most importantly $d_\rho(\lambda)$ is concave and differentiable, with gradient [29], [31]

\[
\nabla d_\rho(\lambda) = G z^*(\lambda) - g
\]

where $z^*(\lambda)$ denotes the optimal solution of the inner problem $\min_{z \in Z} L_\rho(z, \lambda)$ for a given $\lambda$. Moreover, the gradient mapping $\nabla d_\rho : \mathbb{R}^m \to \mathbb{R}^m$ is Lipschitz continuous, with Lipschitz constant $L_d = \rho^{-1}$. The AL algorithm can be expressed as follows [30]

\[
\begin{aligned}
z^{k+1} &= \arg\min_{z \in Z} L_\rho(z, \lambda^k) && \text{(7a)} \\
\lambda^{k+1} &= \lambda^k + \rho (G z^{k+1} - g) && \text{(7b)}
\end{aligned}
\]

which involves the minimization step of the primal vector $z$ and the update step of the dual vector $\lambda$. As shown in [30], the convergence of AL can be assured for a large range of values of $\rho$. The larger the penalty parameter, the faster the AL algorithm will be, but the condition number of the Hessian matrix of subproblem (7a) will also be larger, which in general makes (7a) more difficult to solve. The convergence rate of the AL algorithm (7) is $O(1/k)$ according to [28], and thus the parameter $\rho$ tends to trade off the convergence rates to optimality and to feasibility. To improve the speed of the AL method, [28] proposed an accelerated AL algorithm whose iteration-complexity is $O(1/k^2)$ for linearly constrained convex programs by using Nesterov's acceleration technique. The accelerated AL algorithm is summarized in Algorithm 1.

Algorithm 1 Accelerated augmented Lagrangian method [28]
Input: Initial guess $z^0 \in Z$ and $\lambda^{-1} = \lambda^0$; maximum number $N_{out}$ of iterations.
 1. Set $\alpha_{-1}, \alpha_0 \leftarrow 1$; $\hat{\lambda}^1 \leftarrow \lambda^0$;
 2. for $k = 1, 2, \dots, N_{out}$ do
  2.1. $z^k \leftarrow \arg\min_{z \in Z} L_\rho(z, \hat{\lambda}^k)$;
  2.2. $\lambda^k \leftarrow \hat{\lambda}^k + \rho (G z^k - g)$;
  2.3. if $\|\lambda^k - \hat{\lambda}^k\|_2^2 \le \epsilon$, stop;
  2.4. $\alpha_k \leftarrow \tfrac{1}{2} \left( \sqrt{\alpha_{k-1}^4 + 4 \alpha_{k-1}^2} - \alpha_{k-1}^2 \right)$;
  2.5. $\beta \leftarrow \frac{\alpha_k}{\alpha_{k-1}} - \alpha_k$;
  2.6. $\hat{\lambda}^{k+1} \leftarrow \lambda^k + \beta (\lambda^k - \lambda^{k-1})$;
 3. end.

For the box-constrained QP deriving from (1), the corresponding subproblem (7a) in the AL method is

\[
\begin{aligned}
z^{k+1} = \arg\min_{z} \quad & F(z; \lambda^k) \\
\text{s.t.} \quad & \underline{z} \le z \le \bar{z}
\end{aligned}
\tag{8}
\]

where

\[
F(z; \lambda^k) = \tfrac{1}{2} z' H_A z + (h_A^k)' z
\]

$h_A^k = \frac{1}{\rho} h + G' \lambda^k - G' g$, and $H_A = \frac{1}{\rho} H + G' G$ has the block-sparse structure

\[
H_A = \begin{bmatrix}
\phi_1 & \phi_2 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 \\
\phi_2' & \phi_3 & \phi_4 & \phi_5 & 0 & \cdots & 0 & 0 & 0 \\
0 & \phi_4' & \phi_1 & \phi_2 & 0 & \cdots & 0 & 0 & 0 \\
0 & \phi_5' & \phi_2' & \phi_3 & \phi_4 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & 0 & \cdots & \phi_3 & \phi_4 & \phi_5 \\
0 & 0 & 0 & 0 & 0 & \cdots & \phi_4' & \phi_1 & \phi_2 \\
0 & 0 & 0 & 0 & 0 & \cdots & \phi_5' & \phi_2' & \phi_6
\end{bmatrix}
\]

where

\[
\begin{aligned}
\phi_1 &= \tfrac{1}{\rho} R + B' B, & \phi_2 &= -B' \\
\phi_3 &= \tfrac{1}{\rho} Q + (I + A' A), & \phi_4 &= A' B \\
\phi_5 &= -A', & \phi_6 &= \tfrac{1}{\rho} Q + I
\end{aligned}
\]

Since $G$ is full rank, matrix $H_A \succ 0$.

For solving the strongly convex QP (8), the fast gradient projection method was used in [24], [26]. Inspired by the fact that the Gauss-Seidel method is efficient in solving block tridiagonal linear systems [32], in this paper we propose the use of the cyclic CD method to make full use of block sparsity and avoid the explicit construction of matrix $H_A$. Note that in the gradient projection method or fast gradient projection method [26], the Lipschitz constant parameter deriving from matrix $H_A$ needs to be calculated or estimated to ensure convergence. Therefore, for linear MPC problems that change at runtime such methods would be less preferable than cyclic CD. In this paper, by making full use of the structure of the subproblem, we will implement a cyclic CD method that requires fewer computations, as we will detail in the next section.
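As a rough illustration of Algorithm 1, the following Python sketch applies the accelerated AL iteration to a small dense box-constrained QP with one equality constraint. It is not the authors' C implementation: the names `accelerated_al` and `inner_cd`, the dense list-of-lists storage, the unscaled multiplier update taken from (5) and (7b), the fixed number of inner passes, and the toy data are all assumptions made for the example. The inner solver is a plain reverse cyclic coordinate pass over an explicitly formed Hessian, which is acceptable only because the example is tiny.

```python
import math

def inner_cd(HA, hA, z, lo, hi, passes):
    """Approximately minimize 0.5*z'HA z + hA'z over the box [lo, hi]
    with a fixed number of reverse cyclic coordinate passes (in place)."""
    n = len(z)
    for _ in range(passes):
        for i in reversed(range(n)):
            grad_i = sum(HA[i][j] * z[j] for j in range(n)) + hA[i]
            # exact 1-D minimizer of the quadratic, clipped to the bounds
            z[i] = min(hi[i], max(lo[i], z[i] - grad_i / HA[i][i]))
    return z

def accelerated_al(H, h, G, g, lo, hi, rho=10.0, n_out=200, eps=1e-12, passes=20):
    """Sketch of Algorithm 1 for min 0.5*z'Hz + h'z s.t. Gz = g, lo <= z <= hi."""
    n, m = len(h), len(g)
    # Hessian of L_rho in z: H + rho * G'G (formed explicitly only for this toy case)
    HA = [[H[i][j] + rho * sum(G[r][i] * G[r][j] for r in range(m))
           for j in range(n)] for i in range(n)]
    z = [min(hi[i], max(lo[i], 0.0)) for i in range(n)]
    lam_prev = [0.0] * m          # lambda^{k-1}
    lam_hat = [0.0] * m           # extrapolated multiplier
    alpha_prev = 1.0
    for _ in range(n_out):
        # linear term of L_rho(., lam_hat): h + G'(lam_hat - rho*g)
        hA = [h[i] + sum(G[r][i] * (lam_hat[r] - rho * g[r]) for r in range(m))
              for i in range(n)]
        z = inner_cd(HA, hA, z, lo, hi, passes)                        # step 2.1
        res = [sum(G[r][j] * z[j] for j in range(n)) - g[r] for r in range(m)]
        lam = [lh + rho * rr for lh, rr in zip(lam_hat, res)]          # step 2.2
        if sum((a - b) ** 2 for a, b in zip(lam, lam_hat)) <= eps:     # step 2.3
            break
        alpha = 0.5 * (math.sqrt(alpha_prev ** 4 + 4 * alpha_prev ** 2)
                       - alpha_prev ** 2)                              # step 2.4
        beta = alpha / alpha_prev - alpha                              # step 2.5
        lam_hat = [l + beta * (l - lp) for l, lp in zip(lam, lam_prev)]  # step 2.6
        lam_prev, alpha_prev = lam, alpha
    return z, lam

# Toy data (assumed): min 0.5*(z1^2 + z2^2) s.t. z1 + z2 = 1, 0 <= z1 <= 0.3, 0 <= z2 <= 1
z_opt, lam_opt = accelerated_al([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],
                                [[1.0, 1.0]], [1.0], [0.0, 0.0], [0.3, 1.0])
print(z_opt, lam_opt)
```

For this toy problem the constrained minimizer is z = (0.3, 0.7), because the upper bound on the first variable is active. Limiting `passes` mirrors the cap on inner iterations discussed in the text, at the price of solving each subproblem only approximately.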
B. Coordinate Descent Method

The idea of the CD method is to minimize the objective function along only one coordinate direction at each iteration, while keeping the other coordinates fixed [33], [34]. In [35], [36], the authors showed that the CD method is convergent in convex differentiable minimization problems, and the rate of convergence is at least linear. We first give a brief introduction of the CD method to solve (8). Under the assumption that the set of optimal solutions is nonempty and that the objective function $F$ is convex, continuously differentiable, and strictly convex with respect to each coordinate, the CD method proceeds iteratively for $k = 0, 1, \dots$, as follows:

\[
\begin{aligned}
& \text{choose } i_k \in \{1, 2, \dots, n_z\} && \text{(9a)} \\
& z_{i_k}^{k+1} = \arg\min_{z_{i_k} \in Z} F(z_{i_k}, z_{\neq i_k}^k; \lambda^k) && \text{(9b)}
\end{aligned}
\]

where with a slight abuse of notation we denote by $F(z_{i_k}, z_{\neq i_k}^k; \lambda^k)$ the value $F(z; \lambda^k)$ when $z_{\neq i_k} = z_{\neq i_k}^k$ is fixed. The convergence of the iterations in (9) for $k \to \infty$ depends on the rule used to choose the coordinate index $i_k$.

Procedure 2 One full pass of reverse cyclic coordinate descent on all block variables
Input: $\Lambda = \{\lambda_1, \dots, \lambda_T\}$, $\hat{U} = \{\hat{u}_0, \dots, \hat{u}_{T-1}\}$, $\hat{X} = \{\hat{x}_0, \hat{x}_1, \dots, \hat{x}_T\}$; MPC settings $A$, $B$, $Q$, $R$, $\hat{u}_{\min}$, $\hat{u}_{\max}$, $\hat{x}_{\min}$, $\hat{x}_{\max}$; Algorithm setting $\rho$.
 1. $\hat{x}_T \leftarrow \mathrm{CCD}_{\hat{x}_T \in [\hat{x}_{\min}, \hat{x}_{\max}]} \{ \tfrac{1}{\rho} Q + I,\; -\lambda_T - A \hat{x}_{T-1} - B \hat{u}_{T-1} \}$;
 2. $\hat{u}_{T-1} \leftarrow \mathrm{CCD}_{\hat{u}_{T-1} \in [\hat{u}_{\min}, \hat{u}_{\max}]} \{ \tfrac{1}{\rho} R + B' B,\; B' (\lambda_T + A \hat{x}_{T-1} - \hat{x}_T) \}$;
 3. for $t = T-1, T-2, \dots, 1$ do
  3.1. $\hat{x}_t \leftarrow \mathrm{CCD}_{\hat{x}_t \in [\hat{x}_{\min}, \hat{x}_{\max}]} \{ \tfrac{1}{\rho} Q + I + A' A,\; -(\lambda_t + A \hat{x}_{t-1} + B \hat{u}_{t-1}) + A' (\lambda_{t+1} + B \hat{u}_t - \hat{x}_{t+1}) \}$;
  3.2. $\hat{u}_{t-1} \leftarrow \mathrm{CCD}_{\hat{u}_{t-1} \in [\hat{u}_{\min}, \hat{u}_{\max}]} \{ \tfrac{1}{\rho} R + B' B,\; B' (\lambda_t + A \hat{x}_{t-1} - \hat{x}_t) \}$;
 4. end.
Output: $\hat{U}$, $\hat{X}$.
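To make iteration (9) concrete, here is a minimal Python sketch for a quadratic objective $F(z) = \tfrac{1}{2} z' H_A z + h_A' z$ with box constraints, as in (8). For such an objective the one-dimensional minimization in (9b) has a closed form: take the unconstrained minimizer along coordinate $i$ and clip it to the bounds. The function names `cd_step` and `coordinate_descent`, the `rule` callback, the 0-based indices, the dense storage, and the numerical data are assumptions of this example, and the forward cyclic rule is only a stand-in for whichever index rule is chosen.

```python
def cd_step(HA, hA, z, lo, hi, i):
    """Step (9b) for F(z) = 0.5*z'HA z + hA'z: exact minimization along
    coordinate i (0-based), clipped to [lo[i], hi[i]]. Updates z in place."""
    grad_i = sum(HA[i][j] * z[j] for j in range(len(z))) + hA[i]
    z[i] = min(hi[i], max(lo[i], z[i] - grad_i / HA[i][i]))

def coordinate_descent(HA, hA, z0, lo, hi, rule, n_iter):
    """Run n_iter coordinate updates; rule(k, n) implements the choice (9a)."""
    z = list(z0)
    for k in range(n_iter):
        cd_step(HA, hA, z, lo, hi, rule(k, len(z)))
    return z

def cyclic(k, n):
    """Forward cyclic rule (assumed here for illustration): 0, 1, ..., n-1, 0, ..."""
    return k % n

# Toy data (assumed): the box-constrained minimizer is (0.2, 0.4),
# with the upper bound on the first coordinate active.
HA = [[2.0, 1.0], [1.0, 2.0]]
hA = [-1.0, -1.0]
z = coordinate_descent(HA, hA, [0.0, 0.0], [0.0, 0.0], [0.2, 1.0], cyclic, 50)
print(z)
```

Passing the index rule as a function keeps the update (9b) separate from the choice (9a), so different rules can be compared without touching the coordinate update itself.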
In [36], the authors show that the almost cyclic rule and the Gauss-Southwell rule guarantee convergence. Here we use the almost cyclic rule, which provides convergence according to the following lemma:

Lemma 1 ([36]): Let $\{z^k\}$ be the sequence of coordinate-descent iterates (9), where every coordinate index is iterated upon at least once on every $N$ successive iterations, $N \ge n_z$. The sequence $\{z^k\}$ converges at least linearly to the optimal solution $z^*$ of problem (8).

In this paper we will use the reverse cyclic rule

\[
i_k = n_z - (k \bmod n_z)
\]

to exploit the fact that the shifted previous optimal solution is used as warm start, a rule that clearly satisfies the assumptions for convergence. The implementation of one pass through all $n_z$ coordinates using reverse cyclic CD is reported in Procedure 2. For given $M \in \mathbb{R}^{n_s \times n_s}$, $d \in \mathbb{R}^{n_s}$, the operator $\mathrm{CCD}_{[\underline{s}, \bar{s}]}\{M, d\}$ used in Procedure 2 represents one pass of the reverse cyclic CD method through all $n_s$ coordinates $s_{n_s}, \dots, s_1$ for the following box-constrained QP

\[
\min_{s \in [\underline{s}, \bar{s}]} \tfrac{1}{2} s' M s + s' d
\tag{10}
\]

that is, to execute the following $n_s$ iterations

\[
\begin{aligned}
& \text{for } i = n_s, \dots, 1 \\
& \qquad s_i \leftarrow \left[ s_i - \tfrac{1}{M_{i,i}} (M_{i,\cdot}\, s + d_i) \right]_{\underline{s}_i}^{\bar{s}_i} \\
& \text{end}
\end{aligned}
\tag{11}
\]

where $[s_i]_{\underline{s}_i}^{\bar{s}_i}$ is the projection operator

\[
[s_i]_{\underline{s}_i}^{\bar{s}_i} =
\begin{cases}
\bar{s}_i & \text{if } s_i \ge \bar{s}_i \\
s_i & \text{if } \underline{s}_i < s_i < \bar{s}_i \\
\underline{s}_i & \text{if } s_i \le \underline{s}_i
\end{cases}
\tag{12}
\]

Procedure 3 One pass of reverse cyclic coordinate descent for Step 3.2 of Procedure 2
Input: $\lambda_t$, $\hat{u}_t$, $\hat{x}_t$; MPC settings $A$, $B$, $R$, $\hat{u}_{\min}$, $\hat{u}_{\max}$; Algorithm setting $\rho$.
 1. $V_t \leftarrow \lambda_t + A \hat{x}_{t-1} + B \hat{u}_{t-1} - \hat{x}_t$;
 2. for $i = n_u, \dots, 1$ do
  2.1. $s \leftarrow \tfrac{1}{\rho} R_{i,\cdot}\, \hat{u}_{t-1} + (B_{\cdot,i})' V_t$;
  2.2. $\theta \leftarrow \left[ \hat{u}_{t-1,i} - \frac{s}{\frac{1}{\rho} R_{i,i} + (B' B)_{i,i}} \right]_{\hat{u}_{\min,i}}^{\hat{u}_{\max,i}}$;
  2.3. $\Delta \leftarrow \theta - \hat{u}_{t-1,i}$;
  2.4. $\hat{u}_{t-1,i} \leftarrow \theta$;
  2.5. $V_t \leftarrow V_t + \Delta B_{\cdot,i}$;
 3. end.
Output: $\hat{u}_{t-1}$.

C. Preconditioning

Preconditioning is a common heuristic for improving the computational performance of first-order methods. The optimal design of preconditioners has been studied for several decades, but such computation is often more complex than the original problem and may become prohibitive if it must be executed at run time. Diagonal scaling is a heuristic preconditioning that is very simple and often beneficial. In this paper, we propose to make the change of state variables $\bar{x} = E x$, where $E$ is a diagonal matrix whose $i$th entry is

\[
E_{i,i} = \sqrt{Q_{i,i} + A_{\cdot,i}' A_{\cdot,i}}
\tag{13}
\]

and replace the prediction model $x_{k+1} = A x_k + B u_k$ by
                           i                                                                        x̄k+1 = Āx̄k + B̄uk
                                si if si ≤ si

An efficient way of evaluating Step 3.2 (and also Step 2) of             where Ā = EAE −1 and B̄ = EB. The weight matrix Q
Procedure 2 is reported in Procedure 3. Note that Procedure              and constraints [xmin , xmax ] are scaled accordingly by setting
2 requires O(T (n2x + n2u )) arithmetic operations.                      Q̄ = E −1 QE −1 and x̄min = E −1 xmin , x̄max = E −1 xmax .
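As a minimal sketch of the diagonal scaling in (13), the following Python snippet builds the diagonal of $E$ and the scaled matrices $\bar A$, $\bar B$, $\bar Q$ for a small made-up model. The function names (`scaling_diagonal`, `scale_model`, `step`) and the numerical data are illustrative assumptions and are not taken from the CDAL implementation; the final check only confirms that the scaled model reproduces $\bar x_{k+1} = E x_{k+1}$.

```python
import math

def scaling_diagonal(A, Q):
    """Diagonal entries of E from (13): E_ii = sqrt(Q_ii + A[:, i]' A[:, i])."""
    n = len(A)
    return [math.sqrt(Q[i][i] + sum(A[r][i] ** 2 for r in range(n)))
            for i in range(n)]

def scale_model(A, B, Q, e):
    """Return (A_bar, B_bar, Q_bar) for the change of variables x_bar = E x."""
    n = len(A)
    # A_bar = E A E^-1
    A_bar = [[e[i] * A[i][j] / e[j] for j in range(n)] for i in range(n)]
    # B_bar = E B
    B_bar = [[e[i] * b for b in B[i]] for i in range(n)]
    # Q_bar = E^-1 Q E^-1
    Q_bar = [[Q[i][j] / (e[i] * e[j]) for j in range(n)] for i in range(n)]
    return A_bar, B_bar, Q_bar

def step(A, B, x, u):
    """One step of x+ = A x + B u for list-based matrices."""
    return [sum(a * xj for a, xj in zip(A[i], x)) +
            sum(b * uj for b, uj in zip(B[i], u)) for i in range(len(A))]

# Hypothetical, badly scaled two-state model (illustrative numbers only).
A = [[1.0, 100.0], [0.0, 1.0]]
B = [[0.0], [0.01]]
Q = [[1.0, 0.0], [0.0, 1.0]]

e = scaling_diagonal(A, Q)
A_bar, B_bar, Q_bar = scale_model(A, B, Q, e)

x, u = [0.5, -0.2], [3.0]
x_next = step(A, B, x, u)
x_bar_next = step(A_bar, B_bar, [e[i] * x[i] for i in range(2)], u)
# The scaled prediction equals E times the unscaled one, up to rounding.
assert all(abs(x_bar_next[i] - e[i] * x_next[i]) < 1e-9 for i in range(2))
```

Because $E$ is diagonal, the scaling costs only a few multiplications per matrix entry, so in this sketch it could be recomputed whenever $A$ or $Q$ changes at run time.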
D. CDAL algorithm
   The overall solution method described in the previous section is summarized in Algorithm 4, that we call CDAL. The CDAL algorithm combines CD with the AL method, making use of the reverse cyclic rule for CD and Nesterov's acceleration scheme and preconditioning for AL, that we have specialized for the MPC formulation (2). The AL (outer) iterations are executed for at most $N_{out}$ iterations, the CD (inner) iterations for at most $N_{in}$ iterations. The tolerances $\epsilon_{out}$ and $\epsilon_{in}$ are used to stop the outer and inner iterations, respectively. The C-code implementation of Algorithm 4 is library-free and consists of about 150 lines of code, which is tested in MATLAB R2020a C-Mex Interface.

Algorithm 4 Accelerated reverse cyclic CDAL algorithm for linear (or linearized) MPC
Input: primal/dual warm-start $U^0 = \{\hat u_0, \hat u_1, \cdots, \hat u_{T-1}\}$, $X^0 = \{\hat x_0, \hat x_1, \cdots, \hat x_T\}$, $\Lambda^{-1} = \Lambda^0 = \{\lambda_1, \lambda_2, \cdots, \lambda_T\}$; MPC settings $\{A, B, C, W_y, W_u, W_{\Delta u}, \Delta u_{\min}, \Delta u_{\max}, u_{\min}, u_{\max}, x_{\min}, x_{\max}\}$; Algorithm settings $\{\rho, N_{out}, N_{in}, \epsilon_{out}, \epsilon_{in}\}$

  1. $\alpha_{-1}, \alpha_0 \leftarrow 1$; $\hat\Lambda^1 \leftarrow \Lambda^0$;
  2. for $k = 1, 2, \cdots, N_{out}$ do
   2.1. $\hat U \leftarrow U^{k-1}$, $\hat X \leftarrow X^{k-1}$;
   2.2. for $k_{in} = 1, 2, \cdots, N_{in}$ do
     2.2.1. Update $\hat U$, $\hat X$ by Procedure 2 with $\Lambda = \Lambda^{k-1}$;
     2.2.2. if $\|\hat U - U^{k-1}\|_2^2 \le \epsilon_{in}$ and $\|\hat X - X^{k-1}\|_2^2 \le \epsilon_{in}$ break the loop;
   2.3. $U \leftarrow \hat U$; $X^k \leftarrow \hat X$;
   2.4. for $t = 1, \ldots, T$ do
     2.4.1. $\lambda_t^k = \hat\lambda_t^k + A \hat x_{t-1} + B \hat u_{t-1} - \hat x_t$;
   2.5. if $\|\Lambda^k - \hat\Lambda^k\|_2^2 \le \epsilon_{out}$ stop;
   2.6. $\alpha_k \leftarrow \frac{1}{2}\left(\sqrt{\alpha_{k-1}^4 + 4\alpha_{k-1}^2} - \alpha_{k-1}^2\right)$;
   2.7. $\beta \leftarrow \alpha_{k-1} - \alpha_k$;
   2.8. $\hat\Lambda^{k+1} \leftarrow \Lambda^k + \beta(\Lambda^k - \Lambda^{k-1})$;
  3. end.
 Output: $U$, $X$, $\Lambda$

                IV. NUMERICAL EXAMPLES

   In this section we test the CDAL solver on two benchmark examples used in the Model Predictive Control Toolbox for MATLAB [37]: an ill-conditioned AFTI-16 control problem [38], [39] based on LTI-MPC, and MPC of a nonlinear CSTR [40] using LPV-MPC based on linearizing the model at each sample step. The reported simulation results were obtained on a MacBook Pro with 2.7 GHz 4-core Intel Core i7 and 16GB RAM.

A. AFTI-16 Benchmark Example
   The open-loop unstable linearized AFTI-16 aircraft model reported in [38], [39] is
$$ \dot x = \begin{bmatrix} -0.0151 & -60.5651 & 0 & -32.174 \\ -0.0001 & -1.3411 & 0.9929 & 0 \\ 0.00018 & 43.2541 & -0.86939 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} x + \begin{bmatrix} -2.516 & -13.136 \\ -0.1689 & -0.2514 \\ -17.251 & -1.5766 \\ 0 & 0 \end{bmatrix} u $$
$$ y = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} x $$
The model is sampled using zero-order hold every 0.05 s. The input constraints are $|u_i| \le 25^\circ$, $i = 1, 2$, the output constraints are $-0.5 \le y_1 \le 0.5$ and $-100 \le y_2 \le 100$. The control goal is to make the pitch angle $y_2$ track a reference signal $r_2$. In designing the MPC controller we take $W_y = \mathrm{diag}([10, 10])$, $W_u = 0$, $W_{\Delta u} = \mathrm{diag}([0.1, 0.1])$, and the prediction horizon is $T = 5$.
   To investigate the effects of the three critical techniques (reverse cyclic rule, acceleration, and preconditioning) that we have introduced to improve the efficiency of the CDAL algorithm, we performed closed-loop simulations on eight schemes with fixed $\rho = 1$: 0-CDAL, the basic scheme, without acceleration and reverse cyclic rule; R-CDAL, the scheme with the Reverse cyclic rule; A-CDAL, the Accelerated scheme; AR-CDAL, the Accelerated scheme with the Reverse cyclic rule, and their respective schemes with preconditioner, namely P-CDAL, P-R-CDAL, P-A-CDAL, and finally CDAL that includes all the proposed techniques. The stopping criteria are defined by $\epsilon_{in} = 10^{-6}$, $\epsilon_{out} = 10^{-4}$, and $N_{out}$, $N_{in}$ are set to the large enough value 5000 in order to guarantee good-quality solutions. The closed-loop performance of these eight schemes is almost indistinguishable and is reported in Figure 1. It can be seen that the pitch angle correctly tracks the reference signal from $0^\circ$ to $10^\circ$ and then back to $0^\circ$ and satisfies both the input and output constraints.
   The computational load associated with the above schemes is listed in Table I, in which the last column represents the closed-loop performance. In this paper, the closed-loop performance is denoted by the average value of the MPC cost during the whole closed-loop runtime, namely
$$ \frac{1}{T} \sum_{t=0}^{T-1} \|W_y (y_{t+1} - r_{t+1})\|_2^2 + \|W_u (u_{t+1} - u_{t+1}^r)\|_2^2 + \|W_{\Delta u} \Delta u_t\|_2^2. $$
Since each execution time step requires a different number of inner and outer iterations to meet the required tolerances, "avg" and "max" denote the average and maximum number of iterations (or CPU time) computed over the entire closed-loop execution. It can be seen that the maximum and average numbers of inner-loop iterations of R-CDAL are smaller than those of 0-CDAL (especially the maximum number), while their outer-loop iterations are almost the same, which shows that the reverse cyclic rule provides a significant improvement. Although A-CDAL has fewer outer-loop iterations, it has more inner-loop iterations than 0-CDAL on average. It therefore does not result in a
significant reduction in total computation time. We can see that AR-CDAL achieves fewer iterations both in the inner loop and outer loop and has better average and worst-case computation performance. It can also be seen from Table I that preconditioning significantly reduces the number of outer-loop iterations.

          Fig. 1: Linear AFTI-16 closed-loop performance

TABLE I: Computational performance of different schemes

  method      inner iters     outer iters      time (ms)      cost
              avg     max     avg     max     avg     max
  0-CDAL       75    2616     338    2103     6.4    72.4    42.301
  R-CDAL        8      33     340    2102     5.4    67.8    42.574
  A-CDAL      144    2616      38     174     5.0    44.9    42.548
  AR-CDAL      58     165      38     177     3.7    39.5    42.670
  P-CDAL      282    2597      32     170     2.6    16.2    42.534
  P-R-CDAL     27      67      32     170     1.2    15.8    42.525
  P-A-CDAL    303    2597      10      50     2.3    12.4    42.575
  CDAL         63     267      10      50     1.0    11.6    42.561

   Next, we investigate the effect on computation efficiency of parameter $\rho$, that we expect to trade off feasibility versus optimality. In particular, we expect larger values of $\rho$ to favor feasibility, i.e., provide more inner-loop iterations and fewer outer-loop iterations, and vice versa. The computational performance results obtained by performing closed-loop simulations using the final CDAL algorithm for different values of $\rho$ between 0.01 and 1 are listed in Table II, along with the results obtained by using the OSQP solver [10], that we used to compare CDAL with a state-of-the-art first-order method for QP. The tolerances used in OSQP are set to $10^{-6}$ to provide comparable closed-loop performance.
   Table II confirms that, as $\rho$ decreases, the number of inner-loop iterations gets smaller while the number of outer-loop iterations gets larger. When the parameter value is between 0.01 and 0.1, the CDAL algorithm has very similar computational burden, that is lighter than that of OSQP. Regarding this latter solver, we split between QP problem construction time (including the required matrix factorizations) and pure solution time. Note that in this case the controller is LTI-MPC, and hence the MPC problem construction and matrix factorizations required by OSQP can be performed offline. On the other hand, in case of LPV-MPC problems the total computation time would be spent online and the embedded code would also include routines for problem construction and matrix factorization functions. Instead, CDAL does not require any construction nor factorizations, thus making the solver very lean and fast also in a time-varying MPC setting, as investigated next.

TABLE II: Computational load of CDAL with different values of ρ and comparison with OSQP

  ρ       inner iters     outer iters      time (ms)       cost
          avg     max     avg     max     avg     max
  1        63     267      10      50     1.0    11.6     42.561
  0.5      46     206      12      60     0.9     8.4     42.590
  0.2      31     123      16      89     0.7     5.1     42.612
  0.1      24     108      18      95     0.6     4.7     42.619
  0.05     18      73      20      85     0.5     4.2     42.618
  0.01      7      44      29     144     0.5     4.4     42.620
  OSQP      -       -     530    6950     0.6    10.1*    42.627
                                          1.5    13.8**
  *  : pure solution time, without including matrix factorization
  ** : total time (MPC construction + solution)

B. Nonlinear CSTR Example
   To illustrate the performance of CDAL when the linear MPC formulation (1) changes at runtime we consider the control of the CSTR system [40], described by the continuous-time nonlinear model
$$ \begin{aligned} \frac{dC_A}{dt} &= C_{A,i} - C_A - k_0 e^{-\frac{EaR}{T}} C_A \\ \frac{dT}{dt} &= T_i + 0.3 T_c - 1.3 T + 11.92 k_0 e^{-\frac{EaR}{T}} C_A \\ y &= C_A \end{aligned} \qquad (14) $$
where $C_A$ is the concentration of reagent A, $T$ is the temperature of the reactor, $C_{A,i}$ is the inlet feed stream concentration, which is assumed to have the constant value 10.0 kgmol/m$^3$. The process disturbance comes from the inlet feed stream temperature $T_i$, which has slow fluctuations represented by $T_i = 298.15 + 5 \sin(0.05t)$ K. The manipulated variable is the coolant temperature $T_c$. The constants are $k_0 = 34930800$ and $EaR = -5963.6$ (in MKS units).
   The initial state of the reactor is at a low conversion rate, with $C_A = 8.57$ kgmol/m$^3$, $T = 311$ K. The control objective is to adjust the reactor state to a high reaction rate with $C_A = 2$ kgmol/m$^3$, which is quite a large change in operating conditions. The MPC controller manipulates the coolant temperature $T_c$ to track a concentration reference as well as reject the measured disturbance $T_i$. Due to its nonlinearity, the model in (14) is linearized online at each sampling step:
$$ \frac{dx}{dt} \approx f(x_t, u_{t-1}, p) + \left.\frac{\partial f}{\partial x}\right|_{x_t, u_{t-1}, p} (x - x_t) + \left.\frac{\partial f}{\partial u}\right|_{x_t, u_{t-1}, p} (u - u_{t-1}) $$
where $f(x, u, p)$ is the mapping defined in (14) for $x = [C_A \; T]'$, $u = T_c$, $p = [C_{A,i} \; T_i]'$. By setting $A_c = \left.\frac{\partial f}{\partial x}\right|_{x_t, u_{t-1}, p}$, $B_c = \left.\frac{\partial f}{\partial u}\right|_{x_t, u_{t-1}, p}$, $e_c = f(x_t, u_{t-1}, p) -$
$A_c x_t - B_c u_{t-1}$, we get the following linearized continuous-time model
$$ \frac{d}{dt} x = A_c x + B_c u + e_c $$
We use the forward Euler method with sampling time $T_s = 0.5$ minutes to obtain the following discrete-time model
$$ x_{t+1} = A_d x_t + B_d u_t + e_d $$
where $A_d = I + T_s A_c$, $B_d = T_s B_c$, $e_d = T_s e_c$. Although held constant over the prediction horizon, clearly matrices $A_d$, $B_d$ and the offset term $e_d$ change at runtime, which makes the controller an LPV-MPC. Regarding the performance index, we choose weights $W_y = 1$, $W_u = 0$, $W_{\Delta u} = 0.1$. The physical limitation of the coolant jacket is that its rate of change $\Delta T_c$ is subject to the constraint $[-1, 1]$ K when considering the sampling time $T_s = 0.5$ minutes. The prediction horizon is $T = 10$ steps.
   We compare again CDAL and OSQP in the LPV-MPC setting described above. CDAL is run with $\epsilon_{in} = 10^{-6}$, $\epsilon_{out} = 10^{-4}$, $\rho = 0.01$, and $N_{out} = N_{in} = 5000$, which is a large number that we chose to obtain the best possible closed-loop performance. The tolerances used in OSQP are set to $10^{-6}$ to provide comparable closed-loop performance. The closed-loop simulation results of CDAL and OSQP almost coincide and are plotted in Figure 2, from which it can be seen that $C_A$ tracks the reference signal well, and the fluctuation of $T_i$ is effectively suppressed.
   The computational load and closed-loop performance associated with CDAL and OSQP are reported in Table III. It is apparent that CDAL achieves a slightly smaller cost and, more importantly, a shorter execution time (both average and worst-case) than OSQP, because the latter requires constructing the QP problem and factorizing the resulting matrices.

          Fig. 2: Nonlinear CSTR closed-loop performance (legend: reference, T, Ti; horizontal axis: time)

TABLE III: Computational performance of CDAL and OSQP

  method    inner iters     outer iters      time (ms)       cost
            avg     max     avg     max     avg     max
  CDAL       19      50      12      19     0.4     0.8     0.02202
  OSQP        -       -      49      49     0.2     0.4*    0.02219
                                            1.5    13.8**
  *  : solution time
  ** : MPC construction time + solution time

                V. CONCLUSION
   This paper has proposed a cyclic coordinate-descent method in the augmented Lagrangian method framework for solving linear (possibly parameter-varying) MPC problems. We showed that the method is quite efficient and competitive with other existing methods, thanks to the use of a reverse cyclic rule, Nesterov's acceleration, and a simple heuristic preconditioner. Besides being easy to code, compared to many QP solution methods proposed in the literature CDAL avoids constructing the QP problem and factorizing the resulting matrices, which makes it particularly appealing for LPV-MPC problems. Future research will be devoted to extending the method to handle nonlinear MPC problems by solving a sequence of linearized MPC problems within each sampling period.

                REFERENCES
 [1] S. J. Qin and T. A. Badgwell, “A survey of industrial model predictive
     control technology,” Control engineering practice, vol. 11, no. 7, pp.
     733–764, 2003.
 [2] A. Bemporad, M. Morari, V. Dua, and E. N. Pistikopoulos, “The explicit
     linear quadratic regulator for constrained systems,” Automatica,
     vol. 38, no. 1, pp. 3–20, 2002.
 [3] D. Kouzoupis, G. Frison, A. Zanelli, and M. Diehl, “Recent advances
     in quadratic programming algorithms for nonlinear model predictive
     control,” Vietnam Journal of Mathematics, vol. 46, no. 4, pp. 863–882,
     2018.
 [4] Y. Wang and S. Boyd, “Fast model predictive control using online
     optimization,” IEEE Transactions on control systems technology, vol. 18,
     no. 2, pp. 267–278, 2009.
 [5] S. Wright, “Efficient convex optimization for linear MPC,” in Handbook
     of Model Predictive Control. Springer, 2019, pp. 287–303.
 [6] H. J. Ferreau, H. G. Bock, and M. Diehl, “An online active set strategy
     to overcome the limitations of explicit MPC,” International Journal of
     Robust and Nonlinear Control: IFAC-Affiliated Journal, vol. 18, no. 8,
     pp. 816–830, 2008.
 [7] A. Bemporad, “A quadratic programming algorithm based on nonnegative
     least squares with applications to embedded model predictive
     control,” IEEE Transactions on Automatic Control, vol. 61, no. 4, pp.
     1111–1116, 2016.
 [8] P. Patrinos and A. Bemporad, “An accelerated dual gradient-projection
     algorithm for embedded linear model predictive control,” IEEE
     Transactions on Automatic Control, vol. 59, no. 1, pp. 18–33, 2013.
 [9] S. Boyd, N. Parikh, and E. Chu, Distributed optimization and statistical
     learning via the alternating direction method of multipliers. Now
     Publishers Inc, 2011.
[10] B. Stellato, G. Banjac, P. Goulart, A. Bemporad, and S. Boyd,
     “OSQP: An operator splitting solver for quadratic programs,”
     Mathematical Programming Computation, vol. 12, pp. 637–672, 2020,
     http://arxiv.org/abs/1711.08013, Code available at
     https://github.com/oxfordcontrol/osqp. Awarded best paper of the
     journal for year 2020.
[11] W. Li and J. Swetits, “A new algorithm for solving strictly convex
     quadratic programs,” SIAM Journal on Optimization, vol. 7, no. 3, pp.
     595–619, 1997.
[12] B. Hermans, A. Themelis, and P. Patrinos, “QPALM: a newton-type
     proximal augmented lagrangian method for quadratic programs,” in
     2019 IEEE 58th Conference on Decision and Control (CDC), 2019,
     pp. 4325–4330.
[13] A. Bemporad, “A numerically stable solver for positive semi-definite
     quadratic programs based on nonnegative least squares,” IEEE
     Transactions on Automatic Control, vol. 63, no. 2, pp. 525–531, 2018.
[14] N. Saraf and A. Bemporad, “A bounded-variable least-squares solver
     based on stable QR updates,” IEEE Transactions on Automatic
     Control, vol. 65, no. 3, pp. 1242–1247, 2020.
[15] ——, “An efficient non-condensed approach for linear and nonlinear
     model predictive control with bounded variables,” arXiv e-prints, pp.
     arXiv–1908, 2019.
[16] C.-J. Hsieh, K.-W. Chang, C.-J. Lin, S. S. Keerthi, and S. Sundararajan,
     “A dual coordinate descent method for large-scale linear SVM,” in
     Proceedings of the 25th international conference on Machine learning,
     2008, pp. 408–415.
[36] ——, “On the convergence of the coordinate descent method for
     convex differentiable minimization,” Journal of Optimization Theory
     and Applications, vol. 72, no. 1, pp. 7–35, 1992.
[37] A. Bemporad, M. Morari, and N. L. Ricker, “Model predictive control
     toolbox,” User’s Guide, Version, vol. 2, 2004.
[38] P. Kapasouris, M. Athans, and G. Stein, “Design of feedback control
     systems for stable plants with saturating actuators,” in Proc. 27th IEEE
     Conf. on Decision and Control, Austin, Texas, U.S.A., 1988, pp. 469–479.
[39] A. Bemporad, A. Casavola, and E. Mosca, “Nonlinear control of
     constrained linear systems via predictive reference management,”
     IEEE transactions on Automatic Control, vol. 42, no. 3, pp. 340–349,
     1997.
[40] D. E. Seborg, D. A. Mellichamp, T. F. Edgar, and F. J. Doyle III,
     Process dynamics and control. John Wiley & Sons, 2010.
[17] K.-W. Chang, C.-J. Hsieh, and C.-J. Lin, “Coordinate Descent Method
     for Large-scale L2-loss Linear Support Vector Machines.” Journal of
     Machine Learning Research, vol. 9, no. 7, 2008.
[18] P. Richtárik and M. Takáč, “Distributed coordinate descent method for
     learning with big data,” The Journal of Machine Learning Research,
     vol. 17, no. 1, pp. 2657–2681, 2016.
[19] Y. Xu and W. Yin, “A block coordinate descent method for regularized
     multiconvex optimization with applications to nonnegative tensor
     factorization and completion,” SIAM Journal on Imaging Sciences,
     vol. 6, no. 3, pp. 1758–1789, 2013.
[20] Y. Nesterov, “Efficiency of coordinate descent methods on huge-scale
     optimization problems,” SIAM Journal on Optimization, vol. 22, no. 2,
     pp. 341–362, 2012.
[21] Y. T. Lee and A. Sidford, “Efficient accelerated coordinate descent
     methods and faster algorithms for solving linear systems,” in 2013
     IEEE 54th Annual Symposium on Foundations of Computer Science.
     IEEE, 2013, pp. 147–156.
[22] Z. Allen-Zhu, Z. Qu, P. Richtárik, and Y. Yuan, “Even faster
     accelerated coordinate descent using non-uniform sampling,” in
     International Conference on Machine Learning. PMLR, 2016, pp. 1110–1119.
[23] D. Leventhal and A. S. Lewis, “Randomized methods for linear
     constraints: convergence rates and conditioning,” Mathematics of
     Operations Research, vol. 35, no. 3, pp. 641–654, 2010.
[24] S. Richter, C. N. Jones, and M. Morari, “Computational complexity
     certification for real-time MPC with input constraints based on the fast
     gradient method,” IEEE Transactions on Automatic Control, vol. 57,
     no. 6, pp. 1391–1403, 2011.
[25] V. Nedelcu, I. Necoara, and Q. Tran-Dinh, “Computational
     complexity of inexact gradient augmented Lagrangian methods: application
     to constrained MPC,” SIAM Journal on Control and Optimization,
     vol. 52, no. 5, pp. 3109–3134, 2014.
[26] M. Kögel and R. Findeisen, “Fast predictive control of linear systems
     combining Nesterov’s gradient method and the method of multipliers,”
     in 2011 50th IEEE Conference on Decision and Control and European
     Control Conference. IEEE, 2011, pp. 501–506.
[27] Y. Nesterov, “A method of solving a convex programming problem
     with convergence rate O(1/k^2),” Soviet Mathematics Doklady,
     vol. 27, no. 2, pp. 372–376, 1983.
[28] B. He and X. Yuan, “On the acceleration of augmented Lagrangian
     method for linearly constrained optimization,” Optimization online,
     vol. 3, 2010.
[29] D. P. Bertsekas, “Nonlinear programming,” Journal of the Operational
     Research Society, vol. 48, no. 3, pp. 334–334, 1997.
[30] ——, Constrained optimization and Lagrange multiplier methods.
     Academic press, 2014.
[31] Y. Nesterov, “Smooth minimization of non-smooth functions,”
     Mathematical Programming, vol. 103, no. 1, pp. 127–152, 2005.
[32] P. Amodio and F. Mazzia, “A parallel Gauss–Seidel method for block
     tridiagonal linear systems,” SIAM Journal on Scientific Computing,
     vol. 16, no. 6, pp. 1451–1461, 1995.
[33] D. G. Luenberger, Linear and nonlinear programming. Reading, MA:
     Addison-Wesley, 1984.
[34] J. M. Ortega and W. C. Rheinboldt, Iterative solution of nonlinear
     equations in several variables. SIAM, 2000.
[35] Z.-Q. Luo and P. Tseng, “On the convergence of a matrix splitting
     algorithm for the symmetric monotone linear complementarity
     problem,” SIAM Journal on Control and Optimization, vol. 29, no. 5,
     pp. 1037–1060, 1991.
