ON CONTROL OF GUARANTEED ESTIMATION

We consider the problem of controlling parameters in the course of guaranteed state estimation of linear non-stationary systems. It is supposed that the unknown disturbances in the system and in the observation channel are bounded in norm in the space of square-integrable functions, and that the initial state of the system is also unknown. The process of guaranteed state estimation involves solving a matrix Riccati equation that contains parameters which may be chosen at any instant of time by the first player (an observer) and the second player (an opponent of the observer). The purposes of the players are diametrically opposed: the observer aims to minimize the diameter of the information set at the end of the observation process, whereas the second player aims to maximize it. This problem is interpreted as a two-player differential game for the Riccati equation. All chosen parameters are restricted to compact sets in appropriate spaces of matrices. The payoff of the game is expressed through the Euclidean norm of the inverse Riccati matrix at the end of the process. A special case of the problem with constant matrices is considered. Methods of minimax optimization, optimal control theory, and the theory of differential games are used. Examples are given.


Introduction
State estimation problems for linear non-stationary systems are well studied by now (see, e.g., [Kurzhanski and Vályi, 1996; Kurzhanski and Varaiya, 2014] and the bibliography therein). The main mathematical apparatus for their solution is the theory of control and estimation under uncertainty. In the special case of estimation with integral constraints on the disturbances, the basic relations are quite similar to the equations of the well-known Kalman filter. In the deterministic theory, however, the main object of investigation is the information set. The diameter of this set may serve as a quality characteristic of the observation process. The first player (an observer) tries to minimize this diameter, and the second player (an opponent of the observer) aims to maximize it. Both players can choose, at any instant of time, parameters that lie in compact sets of matrices. Thus, the problem may be interpreted as a differential game for the Riccati equation of the process. As the diameter of the information set is proportional to the Euclidean norm of the inverse Riccati matrix, this value is taken as the payoff of the game. We consider the differential game in the class 'counterstrategy/strategy' and use the approach based on the Hamilton-Jacobi-Bellman-Isaacs (HJBI) equation [Krasovskii and Subbotin, 1988; Subbotin, 1999; Fleming, Soner, 2006]. We offer two ways to overcome the lack of Lipschitz conditions for Riccati equations and suggest a numerical scheme for the solution of the problem. Note that problems of observation control were considered in different aspects in [Grigoryev et al., 1986; Kurzhanski and Vályi, 1996; Kurzhanski and Varaiya, 2014; Ananiev, 2011; Ananyev, 2015]. The results of this work may be used both for quality improvement of measuring systems and for the design of systems counteracting observation.
It should be said that methods of guaranteed estimation are sometimes criticized by estimation specialists. The main argument is that nature, as an opponent, is not always malicious. However, the problem considered here is not a game against nature: it is a game between two players, the observer and his opponent.
A few words about the physical sense of the problem.
The following situation can serve as motivation for our task. Suppose that a flying object equipped with countermeasures moves above the ground and is controlled by the player-opponent (the 2nd player); it is tracked by the player-observer (the 1st player) with the purpose of determining the position of the object at a fixed instant as exactly as possible (perhaps, for its guaranteed destruction). The 1st player can use different vision facilities and additional devices reducing the disturbances of the object. On the other hand, the 2nd player, interfering with the purposes of the 1st one, uses active countermeasures against observation (false targets, thermal rockets, etc.) to increase the observation error. Let us give Example 1. Consider the rectilinear movement of an airplane in the vertical plane at height h. The real initial state x̃_0 ∈ R^4 may differ from [L; h; V; ḣ] and is unknown. The deviation from the basic movement is described by a linear system. The control accelerations are limited by the constraint u_1^2 + u_2^2 ≤ 1/2. The model of measurements is of the form ỹ = √((L − x̃_1)^2 + x̃_2^2). Setting y(t) = ỹ(t) − y_nom(t) and linearizing with respect to the nominal trajectory, we obtain the observation model y(t) = g_1(t)x_1 + g_2(t)x_2 + c(t)v(t), where v(t) is an observation error with the constraint ∫_0^T v^2(t)dt ≤ 1. Let us assume that the acceleration and the parameters h, L, V are constant and known to the observer. The observer can also use the function c(t), c_1 ≤ c ≤ c_2, in order to improve the estimation. On the other hand, the constant velocity V and the accelerations u_1, u_2 may be chosen by the second player before the observation process in order to interfere with the plans of the observer. Another possible problem formulation, on the contrary, assumes that the functions g_1, g_2 have already been chosen by the observer, and the function c(t) is used by the second player to generate the worst noise in the observation. Note that in this example the system does not contain disturbances, but in the general case it does.
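The coefficients g_1, g_2 of the linearized measurement in Example 1 are simply the partial derivatives of the range ỹ = √((L − x̃_1)^2 + x̃_2^2) evaluated along the nominal trajectory. A minimal sketch (the nominal values below are illustrative, not taken from the paper):

```python
import math

def range_meas(x1, x2, L):
    """Range from the observer at (L, 0) to the object at (x1, x2)."""
    return math.sqrt((L - x1) ** 2 + x2 ** 2)

def linearized_coeffs(x1_nom, x2_nom, L):
    """Coefficients g1, g2 of the linearized measurement
    delta_y ~= g1*delta_x1 + g2*delta_x2 about the nominal point."""
    r = range_meas(x1_nom, x2_nom, L)
    g1 = -(L - x1_nom) / r   # d/dx1 of sqrt((L - x1)^2 + x2^2)
    g2 = x2_nom / r          # d/dx2 of sqrt((L - x1)^2 + x2^2)
    return g1, g2

# Hypothetical nominal data: L = 1000 m, h = 500 m, V = 100 m/s, t = 3 s
g1, g2 = linearized_coeffs(100.0 * 3, 500.0, 1000.0)
```

Note that g_1^2 + g_2^2 = 1 automatically, since [g_1, g_2] is a unit vector along the line of sight.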
This work is an extended and improved version of the report [Ananyev, 2017]. The paper is organized as follows. Section 2 is devoted to the background of guaranteed estimation. In Section 3 our problem is formulated. In Section 4 we consider the simplest case of constant matrices in the system. Special attention is paid to conclusions concerning steady-state solutions of the Riccati equation. In Section 5 we pass to the general case. The concepts of strategy and counterstrategy are recalled. The HJBI equation is written down, and the possibility of its solution in a generalized sense is discussed. In the last section, problems of numerical solution are considered and some examples are given.

Guaranteed Estimation
In this work, we consider the linear non-stationary system (1), (2) containing an uncertain function v(·), where x(t) ∈ R^n is the state vector, y(t) ∈ R^m is the output, v(t) ∈ R^k is an uncertain disturbance, u(t) is a known function, and A(·), B(·), G(·), c(·) are bounded Borelian matrices. A matrix A(·) is Borelian if every element a_ij(·) is a Borel measurable function. Suppose that the uncertain function v(·) in (1) and (2) is constrained by inequality (3), where |·| is the Euclidean norm. Besides, the matrix c(·) must satisfy condition (4), where δ > 0 and I_m ∈ R^{m×m} is the identity matrix. Hereafter the symbol ′ denotes transposition. According to the general theory of guaranteed estimation [Kurzhanski and Varaiya, 2014], let us give Definition 1. The collection X_T(y) of state vectors {x(T)} is said to be the information set if for any x ∈ X_T(y) there exists a function v(·) satisfying (3) and such that equalities (1), (2) hold almost everywhere (a.e.) with x(T) = x.
It is easily seen that x ∈ X_T(y) iff there exists a function f(·) satisfying (5) and subject to equation (6) with the final condition x(T) = x. On the other hand, such a function exists iff the minimum of the left-hand side of inequality (5) is less than or equal to 1. Thus, using standard optimization arguments, we come to the following conclusion.

Lemma 1. The information set has the form
of an ellipsoid, where the parameters x̂(·), P(·), h(·) may be found from equations (7)-(11). The value x̂(T) is the center of the bounded ellipsoid X_T(y). A simple sufficient condition for the invertibility of the matrix P(t) on (0, T] is given by Assumption 1. It is well known (see [Kurzhanski and Varaiya, 2014]) that Assumption 1 is equivalent to complete observability of the corresponding system.

Problem Formulation

Consider our observation process as a differential game for Riccati equation (7). This may be described as follows. Let the matrices B, G, and c in equations (1), (2) depend on three arguments: time t, a, and b. The parameters a and b (which may be functions of t) belong to compact sets in finite-dimensional spaces, as in (12). Assume that the matrix functions B(t, ·, ·), G(t, ·, ·), c(t, ·, ·) are continuous for every t and the functions B(·, a, b), G(·, a, b), c(·, a, b) are bounded and measurable for all admissible a, b. Condition (4) holds as before. At any instant t, the parameter b in (12) can be chosen by the second player (the opponent), who tries to worsen the quality of the observation process. On the other hand, the parameter a can be chosen by the first player (the observer), who tries to improve it. Both players evaluate the quality of observation by the terminal payoff γ(X_T(y)), where γ(·) is a non-negative continuous function defined on all compact sets in R^n. The continuity is understood in the sense of Hausdorff convergence.
Here |P| denotes the Euclidean norm of the matrix P; the payoff does not depend on the function u(·) in equation (1). The value of the maximal deviation is max_{x ∈ X_T(y)} |x|. So, the first player tries to minimize the payoff, and his opponent tries to maximize it. The formulas for payoff (13) result from the expression for the support function of the set X_T(y), ρ(l | X_T(y)) = max_{x ∈ X_T(y)} ⟨l, x⟩. The value h(T) may be selected by the second player, who supposes h(T) = 0 in order to maximize the diameter. Moreover, we assume that P(T) > 0. Otherwise, we could use the pseudoinverse matrix P⁻ instead of P⁻¹; this is not necessary, however, because Assumption 1 will always be presumed below. Besides, suppose that both players choose their control parameters in (12) starting from some instant t_0, 0 < t_0 < T. On the initial segment [0, t_0] the parameters of (12) are equal to some known values from the sets A, B. Therefore, the observer gets the set X_0 = X_{t_0}(y) at the instant t_0. It can turn out that h(t_0) = 1. In this case the resource of disturbances is exhausted and we have v(t) = 0 for t ≥ t_0. If so, we obtain x(t_0) = x̂(t_0) and y(t) ≡ G(t)x(t) for t ≥ t_0. Our problem then reduces to an ordinary control problem for the linear system (1) with complete information on the state vector. Of course, the second player can maximize payoff (13) with the help of u(·) when the payoff is the maximal deviation. But in this paper, as a rule, we consider the function u(·) known to the observer. If h(t_0) < 1, we obtain a more complicated situation. It is easily seen from equations (10), (11) that the evolution of the information set X_t(y) depends only on the control parameters in (12) and the innovation function w(·). We assume that the function w(·) is chosen by Nature, and the players cannot influence it. For simplicity, in this paper Nature makes its choice only on the initial segment [0, t_0]; after that the vector function w(·) can be chosen by the second player, who maximizes the diameter and supposes w(·) = 0.
So, we obtain a differential game with complete information for Riccati equation (7) and equation (10), in which no controls are present. Therefore, we can suppose that our payoff γ(X_T(y)) = s(P⁻¹(T)) depends only on the inverse matrix P⁻¹(T); the function s(·) is continuous. For convenience we denote by Q(t) the matrix P⁻¹(t) and obtain equation (14). Further we deal with equation (14). The exact definition of the control strategies of the players is given below. Note that we mostly play for the second player and admit positional counterstrategies a(t, Q, b) of the observer, who discriminates against his opponent.

Optimization of Riccati Equation with Time-Invariant Parameters

Let all the matrices in relations (1), (2), (4), and (12) be time-invariant. Moreover, in this section we use constant parameters a, b. Consider the lower value γ_* = max_b min_a s(Q(T)) of the game and its upper value γ^* = min_a max_b s(Q(T)). We always have γ^* ≥ γ_*, and the strict inequality γ^* > γ_* may occur. From now on, we use the standard Matlab notation, where [A_1, . . . , A_k] means the row-concatenation of matrices of appropriate dimensions (sometimes the comma is replaced by a blank), and [A_1; . . . ; A_k] means the column-concatenation.
Let us solve our game in the class 'counterstrategy/strategy', when the first player may use any functions a(b) ∈ A. In this case, the game has a saddle point (see [Krasovskii and Subbotin, 1988; Fleming, Soner, 2006]) and a value.

Theorem 1. Under Assumption 2 the optimization is carried out only over the parameter a ∈ A, and the value of the game equals γ = γ^* = min_a |P⁻¹(T)|, where the matrices r^*, c^* are substituted into equation (7).
Proof. First note that the matrix P(t) is the solution of linear-quadratic problem (15). Taking into account the compactness of A, B and the continuity of the payoff, we prove the theorem.

Now consider stationary solutions of Riccati equation (7). Such solutions arise under a very long time of observation. We make Assumption 3. The system ẋ = Āx + BC_1 f, y = Gx + cv, where Ā = A − Bc′CG (see Assumption 1 and equation (6)), is completely observable and completely controllable. It is known [Liptser and Shiryayev, 2001] that under Assumption 3 there exists a unique positive-definite solution of the stationary Riccati equation, and Q = P⁻¹. Condition (16) may be considered as an equality condition for minimax problem (15).
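For a scalar system the convergence to a stationary solution is easy to see directly. Assuming the information-matrix equation has the standard filter form Ṗ = −2aP − b²P² + g²/c² (a scalar illustration of the equation; the form is consistent with the special case written out in Example 6, where the quadratic term vanishes), the solution started from P(0) = 0 approaches the positive root of b²P² + 2aP = g²/c² as the observation time grows:

```python
import math

def riccati_P(a, b, g, c, T, dt=1e-3):
    """Euler integration of the scalar information-matrix Riccati equation
    P' = -2*a*P - b**2 * P**2 + g**2 / c**2,  P(0) = 0  (illustrative form)."""
    P = 0.0
    for _ in range(int(T / dt)):
        P += dt * (-2 * a * P - b ** 2 * P ** 2 + g ** 2 / c ** 2)
    return P

def stationary_P(a, b, g, c):
    """Positive root of the stationary equation b^2 P^2 + 2 a P = g^2 / c^2."""
    return (-a + math.sqrt(a ** 2 + b ** 2 * g ** 2 / c ** 2)) / b ** 2

# With an observation time much longer than the transient, P(T) is
# practically stationary (parameter values are illustrative).
P_long = riccati_P(a=0.1, b=1.0, g=1.0, c=1.0, T=20.0)
```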
Assumption 3 holds here, but Assumption 2 does not. It is required to find the value of the game and optimal counterstrategies a*(b) ∈ R² delivering the minimum in (15), as well as optimal strategies b* ∈ R². The value of the game approximately equals 1.1305. It is attained at a* = [0; 0.28], b* = [0.2120; 0.4920]. The optimal functions a*_1, a*_2 are shown in Fig. 1 and 2.
The program for this example uses a grid over the uncertain parameters with step δ = 0.05.
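The grid computation behind such examples can be sketched as follows: tabulate the payoff s(Q(T)) = |P⁻¹(T)| over grids on A and B, then compare the upper value γ^* = min_a max_b with the lower value γ_* = max_b min_a. The scalar Riccati equation and parameter ranges below are stand-ins chosen for illustration, not the system of the example:

```python
def P_end(a, b, T=5.0, dt=1e-3):
    """Scalar stand-in Riccati equation P' = -(b*P)**2 + 1/a**2, P(0) = 0;
    a plays the role of the observer's channel parameter, b of the
    opponent's gain (both illustrative)."""
    P = 0.0
    for _ in range(int(T / dt)):
        P += dt * (-(b * P) ** 2 + 1.0 / a ** 2)
    return P

def s(a, b):
    """Payoff s(Q(T)) = |P^{-1}(T)|."""
    return 1.0 / P_end(a, b)

# Grids with step 0.05, as in the text
A_grid = [0.5 + 0.05 * i for i in range(11)]   # observer's compact set A
B_grid = [0.5 + 0.05 * i for i in range(11)]   # opponent's compact set B

gamma_upper = min(max(s(a, b) for b in B_grid) for a in A_grid)   # min_a max_b
gamma_lower = max(min(s(a, b) for a in A_grid) for b in B_grid)   # max_b min_a
```

Whatever the stand-in dynamics, the computed values must satisfy γ^* ≥ γ_*, which serves as a sanity check on the grid search.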

Common Case
At first, let us return to Example 1. The first player observes the signal y(t) = g_1(t)x_1 + g_2(t)x_2 + cv(t), where the disturbance v(t) = sin t/√(2T − sin 2T) satisfies the constraint. The system is completely observable, but the matrix P(t) > 0 is very ill-conditioned here. Nevertheless, the observer can find his optimal strategy depending on the velocity. Its spline approximation on ten data points is shown in Fig. 3. The maximin of the deviation equals 229.64. It is attained at c = 0.78 and V = 122, u = [−0.27; 0.65]. In Fig. 4 we see the very narrow projection of the set X_T(u, y) onto the plane x_1, x_2, where x is the center of the ellipsoid, the star near the center is the real position, and the second star is the projection of the maximizer. Note that under s(Q(T)) = |P⁻¹(T)| the optimal strategy of the observer is constant and equals c = 0.7; it does not depend on V, as follows from equations (7), (14).

Now we pass to positional strategies depending on t. Unfortunately, Riccati equation (7) does not satisfy the Lipschitz conditions that are present in almost all works on differential games [Krasovskii and Subbotin, 1988; Souganidis, 1985; Souganidis, 1999; Subbotin, 1999; Taras'ev et al., 2006; Fleming, Soner, 2006]. Nevertheless, we can overcome this difficulty in at least two ways. Remember that we deal with equation (14); its solution may be represented through the two solutions of the linear matrix equations (17). The initial conditions for equations (17) may be chosen as M_0 = Q_0 = Q(t_0), N_0 = I_n. From now on, we accept the analog of Assumption 3 for non-stationary systems.
Under Assumption 4, it follows from [Liptser and Shiryayev, 2001] that the matrix Q(t) is nonsingular for any t ∈ (0, T]. Moreover, due to the compactness of constraints (12), the nonsingularity is uniform on any time interval [t_0, T], t_0 > 0, with respect to all measurable parameter functions. The same may be said about the matrices M(t) and N(t). Thus, our differential game is reduced to a game with linear matrix equations (17). The payoff γ(T) = s(Q(T)) of the game is a continuous function of the final state, where x̄(T) is the end of the trajectory of the homogeneous equation ẋ(t) = A(t)x(t) with the known initial state x̄(t_0). Any functions a(t, M, N, b) ∈ A of t, the state {M, N}, and the parameter b ∈ B satisfying the constraints will be considered as strategies of the first player, who tries to minimize γ(T). The strategies of the second player, who tries to maximize γ(T), are any functions b(t, M, N) ∈ B. The controls of the first player are said to be counterstrategies [Krasovskii and Subbotin, 1988]. The solution of (17) is defined step by step with the help of piecewise-constant controls as in [Krasovskii and Subbotin, 1988], [Subbotin, 1999, p. 7]. In these works, the concept of the value of the game in the class 'counterstrategy/strategy' is explained in detail. The game has a saddle point and the value c(t_0, M_0, N_0), where t_0 ∈ (0, T], if the game begins from the position (t_0, M_0, N_0). In our problem, one needs to find a saddle point (a value of the game) and the corresponding optimal strategies a*(·, ·, ·, ·), b*(·, ·, ·). For the problem's solution one needs to build a function c(t, M, N) giving the value of the game under different initial positions (t, M, N). Under Assumption 4 we may suppose that there exist constants α, β such that 0 < α ≤ P(t) and P⁻¹(t) ≤ β (in the sense of quadratic forms) for all t ∈ [t_0, T], t_0 > 0, and for all control parameters.
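The reduction to linear equations (17) can be illustrated on the standard filter form of equation (14). Assuming Q̇ = AQ + QA′ + BB′ − QG′(cc′)⁻¹GQ (this form is an assumption here; the paper's displayed equations are not reproduced), the substitution Q = MN⁻¹ with Ṁ = AM + BB′N, Ṅ = G′(cc′)⁻¹GM − A′N, M(t_0) = Q_0, N(t_0) = I_n recovers the Riccati solution. The following sketch checks this numerically on illustrative constant matrices:

```python
import numpy as np

# Illustrative constant matrices (assumptions, not from the paper's examples)
A  = np.array([[0.0, 1.0], [-1.0, -0.5]])
BB = 0.2 * np.eye(2)          # stands for B B'
GG = np.eye(2)                # stands for G'(cc')^{-1} G
Q0 = np.eye(2)

dt, T = 1e-4, 2.0
Q = Q0.copy()                 # nonlinear Riccati state
M, N = Q0.copy(), np.eye(2)   # linear representation, Q = M N^{-1}
for _ in range(int(T / dt)):
    # Riccati equation for Q = P^{-1}
    Q = Q + dt * (A @ Q + Q @ A.T + BB - Q @ GG @ Q)
    # Equivalent pair of linear matrix equations
    M, N = M + dt * (A @ M + BB @ N), N + dt * (GG @ M - A.T @ N)

err = np.linalg.norm(Q - M @ np.linalg.inv(N))
```

The discrepancy `err` is of the order of the Euler step, confirming that the nonlinear Riccati flow and the linear (M, N)-flow describe the same matrix Q(t).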
At the final instant the boundary condition c(T, M, N) = |MN⁻¹| must hold. As in Section 4, the game becomes simpler under the following

Assumption 5. The compact sets A(t), B(t) in (12) may depend on time; let c(t) = c(t, b), B(t) = g(t, a)r(t, b), G(t) = G(t, a). The constraints satisfy the following condition: there is a function
Besides, the relation B(t)c′(t) = 0 must hold.

Theorem 2. Under Assumption 5 the game is reduced to a problem of optimal control over the functions a(t) ∈ A(t). The value of the game equals γ = γ^* = min_{a(·)} |Q(T)|, where the matrix functions r*(t), c*(t) are substituted into equation (14).
Proof. The proof is similar to that of Theorem 1. Indeed, the matrix Q(t) is the solution of a linear-quadratic problem analogous to (15). Therefore, the functions r*(·), c*(·) are maximizers of |Q(T)| independently of the function a(·).

HJBI Equation
For our case, the HJBI equation of [Subbotin, 1999, Theorem 9.1] takes the form (19), with the Hamiltonian built from the function h(t, a, b, M, N, S); here the inner product ⟨A, B⟩ means trace A′B. If the function c(t, M, N) has been built, the optimal strategies of the first and second players are defined as selectors of the inclusions a*(t, b, M, N) ∈ Argmin_a h(t, a, b, M, N, Dc(t, M, N)) and b*(t, M, N) ∈ Argmax_b min_a h(t, a, b, M, N, Dc(t, M, N)).
It is known that the solution of (19) in the minimax sense coincides with the viscosity solution (see [Subbotin, 1999; Fleming, Soner, 2006]). Note that both solutions are unique.

A Numerical Solution
A numerical procedure can be built on the basis of [Souganidis, 1985; Souganidis, 1999; Taras'ev et al., 2006]. Here we do not perform a decomposition and consider the initial equation (14) on the interval [t_0, T], t_0 > 0. Denote by F(t, Q, a, b) the right-hand side of this equation. Let us establish some properties of solutions of equation (14). By K^n_{α,β} we denote the segment of nonnegative-definite symmetric matrices Q with αI_n ≤ Q ≤ βI_n. The matrix A(t) will be considered Lipschitzean in t.
Lemma 2. Let Assumption 4 hold. Then the solutions of equation (14), the mappings F(t, Q, a, b), and the final payoff s(Q) possess the following properties.
R1. For any instant t_0 ∈ (0, T) there exist positive constants α, β that do not depend on the control parameters and are such that Q(t) ∈ K^n_{α,β} for all t ∈ [t_0, T].

R2. The mapping F(t, Q, a, b), Q ∈ K^n_{α,β}, may be continued on the space R^{n×n} in such a way that it is bounded and the uniform Lipschitz condition |F(t, Q_1, a, b) − F(t, Q_2, a, b)| ≤ C_1|Q_1 − Q_2| holds.
R3. The function s(Q), Q ∈ K^n_{α,β}, may be continued on the space R^{n×n} in such a way that it is bounded and the Lipschitz condition |s(Q_1) − s(Q_2)| ≤ C_2|Q_1 − Q_2| holds.
The lemma may be proved with the help of the Kirszbraun theorem (see [Federer, 1969, Theorem 2.10.43]) on the extension of Lipschitzean maps.
Henceforth, we assume that the mappings F and s are continued according to Lemma 2. The Hamiltonian is now defined as H(t, Q, S) = max_{b∈B} min_{a∈A} h(t, a, b, Q, S), where h(t, a, b, Q, S) = ⟨S, F(t, Q, a, b)⟩. The function c : [t_0, T] × R^{n×n} → R (the value of the game) satisfies, in the corresponding formalization, the HJBI equation. Here the symbol Dc ∈ R^{n×n} means the gradient of c(t, Q), which is a matrix. For the approximation of c(t, Q), we consider the partition t_i = t_0 + i∆ of the interval [t_0, T], where i ∈ 0 : N(∆).
Using [Souganidis, 1999, Theorem 4.4], we obtain an estimate of the convergence rate of the approximation scheme. We can suggest the following numerical algorithm.
1. Choose a finite set (a grid) N, a collection of positive-definite matrices of small norm. The set must be contained in the segment K^n_{0,β} and uniformly cover the attainability domains of the Riccati equation.

2. Form and remember the function c_N(Q) = s(Q) on the grid.
3. Compute c_{N−1}(Q) = max_b min_a c_N(Q + ∆F(t_{N−1}, Q, a, b)) and the corresponding optimal controls b*_N and a*_N(b).
4. On subsequent steps the grid function c_i(Q) = max_b min_a c_{i+1}(Q + ∆F(t_i, Q, a, b)) is formed, together with the corresponding optimal controls b*_i and a*_i(b). If the matrix Q + ∆F(t_i, Q, a, b) does not lie in the grid, then this value is replaced by the nearest element from N.
5. The value c_0(Q) gives an approximate value of the game.
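A scalar stand-in makes steps 1-5 concrete. The dynamics Q̇ = b² − a²Q² and the parameter sets below are illustrative assumptions; the boundary condition is s(Q) = |Q|, the backward recursion is c_i(Q) = max_b min_a c_{i+1}(Q + ∆F(t_i, Q, a, b)), and off-grid values are projected onto the nearest grid element as in step 4:

```python
import bisect

A_set = [0.5, 0.75, 1.0]                    # observer's compact set A
B_set = [0.5, 0.75, 1.0]                    # opponent's compact set B
grid  = [0.05 * k for k in range(1, 81)]    # grid N of states Q in (0, 4]
dt, steps = 0.05, 40                        # partition of [t0, T]

def project(q):
    """Replace q by the nearest grid element (step 4 of the algorithm)."""
    i = bisect.bisect_left(grid, q)
    cands = grid[max(i - 1, 0):i + 1] or [grid[-1]]
    return min(cands, key=lambda g: abs(g - q))

c = {q: abs(q) for q in grid}               # boundary condition c_N(Q) = s(Q) = |Q|
for _ in range(steps):                      # backward in time, steps 3-4
    prev = c
    c = {q: max(min(prev[project(q + dt * (b * b - a * a * q * q))]
                    for a in A_set)
                for b in B_set)
         for q in grid}

value = c[project(1.0)]                     # step 5: approximate value from Q(t0) = 1
```

For these illustrative dynamics the saddle point sits at a = b = 1, which keeps Q near its stationary value 1, so the computed game value stays close to 1 up to the grid step.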
Example 5. Consider the system of Example 3 and suppose that t_0 = 1, T = 10. By the method based on the above algorithm, we get the value c_0(Q(1)) = 1.104, which is close to the value in Example 3. Note that the solutions of the Riccati equation quickly stabilize here to the steady-state solution under all control parameters.
Example 6. Let us optimize the estimation for the two-dimensional oscillating system with δI_2 ≤ c_1(t)c′_1(t) ≤ I_2. Here is a case where Assumption 5 holds, i.e., the 2nd player chooses c_1(t)c′_1(t) = I_2 and b_1(t) ≡ 1. Therefore, we obtain the Riccati equation in the form Ṗ(t) = −A′P(t) − P(t)A + G²(t), because the observer chooses a_1(t) ≡ 0. The equation has an explicit solution in terms of the matrix X(t) = [cos t, sin t; −sin t, cos t].
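For the oscillating system the drift matrix is A = [0, 1; −1, 0] (consistent with the fundamental matrix X(t) given above). Since A is skew-symmetric (A′ = −A), the equation Ṗ = −A′P − PA + G²(t) reads Ṗ = AP + PA′ + G²(t), and variation of constants gives P(t) = X(t)(P(0) + ∫₀ᵗ X′(s)G²(s)X(s) ds)X′(t), using X⁻¹ = X′ for the rotation matrix. A numerical check with an illustrative G²(t) (the paper's G²(t), which depends on the observer's choice, is not reproduced here):

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])    # skew-symmetric: A.T == -A

def G2(t):
    # Illustrative choice of the symmetric matrix G^2(t)
    return np.array([[1.0 + 0.5 * np.cos(t), 0.0], [0.0, 1.0]])

def X(t):
    # Rotation matrix [cos t, sin t; -sin t, cos t], the fundamental matrix of A
    return np.array([[np.cos(t), np.sin(t)], [-np.sin(t), np.cos(t)]])

T, dt = 2.0, 1e-4
n = int(T / dt)

# Direct Euler integration of P' = -A'P - PA + G^2(t), P(0) = 0
P = np.zeros((2, 2))
for k in range(n):
    t = k * dt
    P = P + dt * (-A.T @ P - P @ A + G2(t))

# Variation-of-constants formula for the same solution
I = np.zeros((2, 2))
for k in range(n):
    t = k * dt
    I = I + dt * (X(t).T @ G2(t) @ X(t))
P_formula = X(T) @ I @ X(T).T

err = np.linalg.norm(P - P_formula)
```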
Here the functional |P⁻¹(T)| is concave in the variable function a(·). Hence, the approximate optimal solution is a piecewise-constant function with values in {0, 1}.
In the class of constant functions the minimal value of the functional equals π −1 .

Conclusion
The problem of observation control for non-stationary linear systems has been considered. The quality of observation is measured by the diameter of the information set at the end of the time interval or by the maximal deviation of this set from the origin. The problem is reduced to a differential game for the Riccati equations, where the first part of the parameters is chosen by the first player (an observer) and the second part is chosen by the second player (an opponent), who tries to worsen the quality of observation. In the general case, there is a saddle point in the class 'counterstrategy/strategy'. The value of the game may be found by integration of the corresponding HJBI equation, the solution of which is understood in a generalized sense. The optimal strategies are also defined via this solution. A numerical approximation is specified, and an estimate of the rate of convergence is given for the approximating scheme. Particular cases of the equations with constant coefficients are considered, and solutions for steady-state regimes of the Riccati equation are given. Examples are considered as well.