NUMERICAL ALGORITHMS FOR STATE-LINEAR OPTIMAL IMPULSIVE CONTROL PROBLEMS BASED ON FEEDBACK NECESSARY OPTIMALITY CONDITIONS

We propose and compare three numerical algorithms for optimal control of state-linear impulsive systems. The algorithms rely on the standard transformation of impulsive control problems through a discontinuous time rescaling, and on the so-called "feedback" (direct and dual) maximum principles. The feedback maximum principles are variational necessary optimality conditions operating with feedback controls, which are designed through the usual constructions of Pontryagin's Maximum Principle (PMP); though these optimality conditions are formulated completely within the formalism of PMP, they essentially strengthen it. All the algorithms are non-local in the sense that they are aimed at improving non-optimal extrema of PMP (local minima) and, therefore, show the potential for global optimization.


Introduction
Our study lies in the vein of (relatively recent) works [Dykhta, 2014; Dykhta, 2015], where a new sort of necessary optimality conditions is developed for classical and non-smooth optimal control problems. Such conditions, based on the technique of modified Lagrangians, operate with a particular, "extremal" class of feedback controls, and appear to be much "closer" to sufficient conditions (and dynamic programming) than the classical PMP does.
This paper follows our recent works [Sorokin and Staritsyn, 2018; Staritsyn and Sorokin, 2019], where different versions of the so-called feedback maximum (or minimum) principle were obtained for certain classes of impulsive (and related continuous and discrete-time) control problems (see also [Dykhta and Samsonyuk, 2018; Sorokin, 2014]). Now, based on these theoretical results, we develop numerical algorithms for optimal control which demonstrate the potential of global optimization techniques. Here, we concentrate on the state-linear case, which enjoys a sort of "duality" enabling us to employ the dual necessary condition along with the direct one (in some cases, the dual approach appears advantageous compared to the direct one). The feedback optimality conditions we use in this paper are formulated for an auxiliary continuous optimal control problem, and require "post-discretization". A similar approach for a nonlinear pre-discretized impulsive control problem was presented in [Sorokin and Staritsyn, 2018].

Problem Statement
We address a class of optimal control problems which are linear in the state variable and involve both usual (measurable, uniformly essentially bounded functions) and impulsive (distributions, or signed Borel measures) control inputs: Here, ⟨·, ·⟩ denotes the scalar product in R^n; c, x_0 ∈ R^n are given vectors; U ⊂ R^m is compact; and A, B : U → R^{n×n}, a, b : U → R^n are given matrix- and vector-valued functions, assumed to be Borel measurable. Controls u are functions of class L^∞(T, U), while v are right-continuous on [0, T) functions T → R of bounded variation (BV^+(T, R)); the derivative v̇ shall be understood in the generalized sense, i.e., as a signed Borel measure (or rather, a first-order distribution); Var_T v(·) is the total variation of v on T.
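For intuition, the total variation Var_T v(·) of a right-continuous control of bounded variation can be computed explicitly when v is piecewise constant, in which case the generalized derivative v̇ is a finite sum of Dirac measures. A minimal sketch (the encoding of v by jump pairs is an assumption made purely for illustration; in the paper, v is an arbitrary BV^+ function):

```python
# Total variation of a right-continuous, piecewise-constant BV control v.
# The control is encoded as (jump time, new value) pairs -- an illustrative
# assumption; between jumps v is constant, so its generalized derivative dv
# is a sum of Dirac measures sitting at the jump times.

def total_variation(initial_value, jumps):
    """Var of v on [0, T] for v with finitely many jumps (sorted by time)."""
    var, current = 0.0, initial_value
    for _t, new_value in jumps:
        var += abs(new_value - current)  # mass of the atom of dv at time t
        current = new_value
    return var

# v jumps 0 -> 2 at t = 0 (an initial impulse), then 2 -> 1 at t = 0.5
print(total_variation(0.0, [(0.0, 2.0), (0.5, 1.0)]))  # 3.0
```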
Following the standard methodology [Miller and Rubinovich, 2003], one reduces impulsive system (1)-(3) to an ODE driven by uniformly bounded controls. This reduction, based on an appropriate Lipschitzian reparameterization of the time variable, is well known and rather typical for impulsive control theory. For brevity, we drop the details and refer to [Miller and Rubinovich, 2003; Dykhta and Samsonyuk, 2000; Zavalishchin and Sesekin, 1997]. As a result of the transformation, we obtain the following terminally constrained classical optimal control problem (P): here, controls are w, and trajectories are z := (x, y) ∈ W^{1,1}(T, R^n × R_+). Note that (P) is equivalent to the original problem stated on solutions of (1)-(3), i.e., any minimizing sequence of controls in one problem produces a minimizing sequence in the other one. We stress that (P) is equipped with a scalar terminal constraint y(T) = y_T.

A quadruple σ := (x, y, u, v) is a (control) process of system (4), (5). A process is called admissible as soon as it satisfies (4)-(6). Thanks to the linearity in x and the Borel measurability of A, B, a, b, problem (P) has a minimizer within the class of admissible processes, which implies that the original impulsive control problem also has an optimal solution.
Theoretical Background of the Algorithms: Feedback Maximum Principles
Prior to presenting the announced numerical algorithms, we introduce some necessary objects and recall the basic theoretical background related to feedback necessary optimality conditions (further details can be found in [Staritsyn and Sorokin, 2019]). We start with the usual ingredients of PMP.

Adjoint System and Hamiltonians
The Pontryagin function (the non-maximized Hamiltonian) of problem (P) is expressed through the "partial Hamiltonians" H^0 and H^1 (notice that H is independent of y). Then, the adjoint (dual) equation takes the form ψ̇ = −∂H/∂x, where ξ = const is the dual variable of y (for ξ, there is no transversality condition dictated by the Maximum Principle). The maximized Hamiltonian is easily calculated through the maximized partial Hamiltonians H^{0,1} := max_{u ∈ U} H^{0,1}. The maximizers of H in u and v are multifunctions, defined piecewise according to which of the partial Hamiltonians is active.
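In computations, where U is only known to be compact and A, B, a, b are merely Borel measurable, the maximized partial Hamiltonians are naturally approximated by maximization over a finite sample of U. The following sketch assumes a scalar-product form H^0 = ⟨ψ, A(u)x + a(u)⟩ consistent with the state-linear dynamics; the grid of U and the toy data are illustrative assumptions:

```python
# Grid approximation of a maximized partial Hamiltonian
# H0(psi, x) = max_{u in U} <psi, A(u) x + a(u)> for a state-linear system.
# All dimensions, the grid of U, and the sample data A, a are assumptions.

def matvec(M, x):
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def dot(p, q):
    return sum(p_i * q_i for p_i, q_i in zip(p, q))

def maximized_partial_hamiltonian(psi, x, U_grid, A, a):
    """Return (max value, maximizing u) of u -> <psi, A(u) x + a(u)>."""
    best_val, best_u = -float("inf"), None
    for u in U_grid:
        val = dot(psi, matvec(A(u), x)) + dot(psi, a(u))
        if val > best_val:
            best_val, best_u = val, u
    return best_val, best_u

# Toy 1-D data: A(u) = [[u]], a(u) = [0], U = [-1, 1] sampled at 5 points
A = lambda u: [[u]]
a = lambda u: [0.0]
U_grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
val, u_star = maximized_partial_hamiltonian([2.0], [3.0], U_grid, A, a)
print(val, u_star)  # 6.0 1.0
```

The grid may, of course, be refined adaptively; the sketch only fixes the data flow between ψ, x, and the control sample.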

Feedbacks
Below, we deal with feedback controls which are assumed to be measurable in t. By Z(w) we denote the set of both Carathéodory and Krasovskii-Subbotin (sampling) solutions [Krasovskii and Subbotin, 1988; Clarke et al., 1998] of system (4), (5) associated with w. Recall that at least one sampling solution exists for any feedback w, which implies Z(w) ≠ ∅.
As is obvious, functions z ∈ Z(w) generically fail to satisfy the terminal constraint y(T) = y_T. This requires addressing the "corrected" multifunctions, which guarantee the mentioned property [Staritsyn and Sorokin, 2019; Sorokin and Staritsyn, 2018].

Direct and Dual Feedback Maximum Principles
Now we shall fix an admissible process σ̄ = (z̄ = (x̄, ȳ), w̄), whose optimality is the question of interest.

Numeric Algorithms
In this section, we consider an explicit Euler discretization of dynamical systems (4), (5), (7), (10) on a uniform grid t_k = kh, k ∈ {0, 1, 2, . . . , N}, over the time interval [0, T]; the time step is h = T/N. All control, state, and adjoint functions are assumed to be defined at the nodes of the grid.
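As a minimal illustration of this discretization (not the paper's exact systems (4), (5), (7), (10), whose right-hand sides are problem-specific), the following sketch integrates a linear-in-state system x' = Ax forward by explicit Euler and its adjoint ψ' = −Aᵀψ backward on the same grid; all data are assumptions:

```python
# Explicit Euler on a uniform grid t_k = k*h, h = T/N, for a linear-in-state
# system x' = A x (control frozen, for illustration) and its adjoint
# psi' = -A^T psi. The matrix A and the test data below are assumptions.

def euler_forward(x0, A, h, N):
    """Integrate x' = A x from t = 0 to t = T = N*h."""
    xs = [list(x0)]
    for _ in range(N):
        x = xs[-1]
        step = [h * sum(A[i][j] * x[j] for j in range(len(x)))
                for i in range(len(x))]
        xs.append([xi + si for xi, si in zip(x, step)])
    return xs

def euler_backward_adjoint(psiT, A, h, N):
    """Integrate psi' = -A^T psi backwards from t = T to t = 0."""
    psis = [list(psiT)]
    for _ in range(N):
        p = psis[-1]
        # stepping backward in time: psi(t - h) = psi(t) + h * A^T psi(t)
        step = [h * sum(A[j][i] * p[j] for j in range(len(p)))
                for i in range(len(p))]
        psis.append([pi + si for pi, si in zip(p, step)])
    psis.reverse()          # psis[k] now corresponds to node t_k
    return psis

# Scalar sanity check: x' = x on [0, 1], so x(1) approximates e
A = [[1.0]]
xs = euler_forward([1.0], A, h=1.0 / 100, N=100)
print(round(xs[-1][0], 2))  # 2.7
```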

Direct Algorithm A
The direct algorithm is based on Theorem 2.1.
Step A0 (initialization). Fix the accuracy ε > 0 (this parameter of the algorithm measures the depth of control improvement and determines the termination of the iterative process).
Step A4 (simultaneous calculation of the feedback control w = (u, v) and the respective trajectory z_w = (x_w, y_w)).
The outcome of this cycle is a control process σ_w := (z_w, w_w).
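The simultaneous construction of Step A4 can be sketched in the spirit of Krasovskii-Subbotin sampling: at each node of the grid the feedback is evaluated at the current state, and the state is advanced by one Euler step with the control frozen. The right-hand side f, the feedback map, and the toy data below are illustrative assumptions, not the paper's exact scheme:

```python
# Sampling construction of the feedback trajectory (a sketch of Step A4):
# at node t_k, evaluate the feedback w at the current state, then make one
# explicit Euler step of x' = f(x, w) with that control frozen.

def sampling_trajectory(x0, feedback, f, h, N):
    """Krasovskii-Subbotin style piecewise-constant-control integration."""
    xs, ws = [x0], []
    for k in range(N):
        w = feedback(k * h, xs[-1])           # feedback sampled at the node
        ws.append(w)
        xs.append(xs[-1] + h * f(xs[-1], w))  # one Euler step, w frozen
    return xs, ws

# Toy scalar system x' = w*x with a hypothetical feedback rule
# "push up while x < 1, switch off afterwards":
f = lambda x, w: w * x
feedback = lambda t, x: 1.0 if x < 1.0 else 0.0
xs, ws = sampling_trajectory(0.5, feedback, f, h=0.1, N=20)
print(xs[-1] >= 1.0, ws[0], ws[-1])  # True 1.0 0.0
```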

Dual Algorithm B
Steps B0 and B1 coincide with Steps A0 and A1, respectively.
Step B2 is the same as Step A3.
The outcome of this cycle is a pair δ = (ζ_ω, w_ω).
Step B4. If the cost improvement condition holds, then set w̄ := w_ω, and return to Step B1. Otherwise, go to Step B2.

Mixed Algorithm C
Step C6 is the same as Step A3.
Step C7 is equivalent to Step A4, with ψ_ω from Step C3 used instead of ψ̄.
Step C8. If I(σ_w) = ⟨c, x_w(N)⟩ ≤ ⟨c, x̄(N)⟩ = I_rec, then set w̄ := w_w, and return to Step C1. Otherwise, go to Step C6.
The "mixed" algorithm combines the direct and dual maximum principles: Steps C2, C3 correspond to the construction of feedback controls ω = (υ, ν) from Theorem 2.2, while Steps C6, C7 involve feedback controls w = (u, v) of the type we met in Theorem 2.1.

Examples
The variational problem addressed in this paper represents the simplest class of nonconvex optimal impulsive control problems with states of bounded variation, the class from which the theory of dynamic optimization with discontinuous solutions actually originates. Meanwhile, problems of this class arise in various models of physical processes, some of which can be found, e.g., in [Dykhta and Samsonyuk, 2000; Zavalishchin and Sesekin, 1997].
As an example, we discuss below a simple model from laser technology.

Example 1: Maximizing the excitation of a two-level atom
Consider the following singular bilinear problem: Here, a, b are parameters, 0 < b ≤ 2a. The system describes the dynamics of a resonant approximation of an atom whose state varies between the ground and excited levels under a controlled resonant electromagnetic field. The input v is a linear function of the amplitude of a polarized light wave (for details, we refer to [Dykhta and Samsonyuk, 2000]), and the performance criterion represents the averaged population of the upper atomic level.
In the absence of the constraint on the total variation of the control v, a complete analysis of this model is carried out in [Dykhta and Samsonyuk, 2000] by the variational maximum principle, which is no longer formally applicable in our case.
To investigate the problem numerically, we apply Algorithms A-C. Passing to the notation of problem (P), the above model rewrites: For different values of the parameters a, b, T, M, a series of computations yields a qualitatively similar picture, whose profile corresponds to a single impulse at the initial time moment; this agrees with the analytical solution [Dykhta and Samsonyuk, 2000]. For a = b = T = 1, M = 3, and v ≡ 0.75 chosen as the initial (admissible) control, the resulting process is plotted in Figs. 1-3. The control sequences generated by all Algorithms A-C converge to the same solution, but the number of iterations essentially depends on the value of the parameter ξ (see, e.g., the table below).

The following academic example is aimed at demonstrating the global optimization potential of Algorithms A-C, in comparison with the direct method involving popular solvers such as IPOPT, APOPT, and BPOPT.
Example 2: Discarding of a strict local extremum
In [Staritsyn and Sorokin, 2019], we find the following non-convex variational problem, which is equivalent to the pre-impulsive model with the cost x_2(2) → min. As a matter of comparison, we applied the free academic package GEKKO for Python 3 [Beal, Hill, Martin and Hedengren, 2018], which automatically reduces an optimal control problem to an NLP through discretization in time. The terminal constraint y(T + M) = T is handled by quadratic penalization. The computations were carried out in the remote mode (APMonitor, Version 0.9.2) with IPOPT, APOPT, and BPOPT as internal NLP solvers. As an outcome of multiple numeric experiments, the same control v ≡ 1/2 was found, which is known to be a local Pontryagin extremal with the cost I = 0.
Starting from this local solution, all three Algorithms A-C produce, in a single iteration (!), a process with the cost I = −6 and the states/controls presented in Figs. 4 and 5.
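The quadratic penalization of the terminal constraint mentioned in Example 2 amounts to augmenting the cost with a weighted squared residual of the constraint; a minimal sketch (the weight K and all names are illustrative assumptions):

```python
# Quadratic penalty handling of a terminal constraint y(T + M) = y_target:
# the NLP minimizes I(sigma) + K * (y_end - y_target)**2 with a large
# weight K. The names and sample numbers are assumptions for illustration.

def penalized_cost(terminal_cost, y_end, y_target, K=1.0e4):
    residual = y_end - y_target
    return terminal_cost + K * residual ** 2

# Feasible endpoint: no penalty is added
print(penalized_cost(-6.0, y_end=2.0, y_target=2.0))  # -6.0
# Infeasible endpoint: the violation dominates the cost
print(penalized_cost(-6.0, y_end=3.0, y_target=2.0))  # 9994.0
```

For a sufficiently large K, minimizers of the penalized NLP approximate feasible minimizers of the constrained problem, which is the behavior relied upon in the GEKKO experiments.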

Conclusion
It is important to stress that the direct and dual feedback maximum principles are independent of one another in the sense that a process satisfying one of them need not satisfy the other (see also [Dykhta, 2014; Sorokin, 2014]). As some examples show, this feature is inherited by the respective Algorithms A and B. Thus, a combination of the direct and dual approaches in the spirit of Algorithm C could be a promising way. Furthermore, such a combination is natural from the very "machinery" viewpoint. Indeed, given an initial control process, Theorem 2.1 produces the state of a new, "better" process σ, which can be used as the initial data for Theorem 2.2. Next, the outcome of the dual feedback maximum principle, i.e., the adjoint state of an "improving" process, can be used as an input of the direct feedback maximum principle.