On the Existence of the Optimal Lyapunov Function and of a Continuous Optimal Control for a Certain Problem in the Analytic Construction of Regulators
E. G. Albrecht
The problem of the analytic construction of an optimal regulator \(u[x,t]\) for certain nonlinear systems under large initial disturbances is considered. The existence of continuous solutions \(v[x,t]\) and \(u[x,t]\) of the Lyapunov–Bellman equation is proved in the case when the optimal control action \(u_{x^0,t_0}(t)\), considered as a function of time, is unique for each initial point \(x^0\) of the domain of admissible disturbances.
§ 1. Statement of the Problem and Preliminary Remarks
Let the behavior of the controlled system be described by the differential equations of the disturbed motion
\[ \frac{dx_i}{dt}=\varphi_i(x,t)+\psi_i(x,t)u \quad (i=1,\ldots,n). \tag{1.1} \]
Here \(x=\{x_1,\ldots,x_n\}\) is the \(n\)-dimensional vector of phase coordinates of the system; \(u\) is the control action.
Consider the following four problems:
Problem 1.1. It is required to find a control \(u[x,t]\) such that, along the trajectories of system (1.1), the functional
\[ I[x^0,t_0,u]=\int_{t_0}^{T} G(x(t),u(t),t)\,dt+f(x(T)), \tag{1.2} \]
attains a minimum, where \(G(x,u,t)\) is a positive-definite function of the variables \(x_i\) and \(u\). In this article we restrict ourselves to the case
\[ G(x,u,t)=g(x,t)+g_1(x,t)u+u^2. \tag{1.3} \]
Problem 1.1a. It is required to find a control \(u[x,t]\), constrained by the condition \(|u[x,t]|\leq 1\), under which, along the trajectories of system (1.1), the functional (1.2), (1.3) attains a minimum.
Problem 1.2. It is required to find a control \(u[x,t]\) such that the unperturbed motion \(x=0\) is asymptotically stable and, along the trajectories of system (1.1), the functional
\[ I[x^0,t_0,u]=\int_{t_0}^{\infty} G(x(t),u(t),t)\,dt, \tag{1.4} \]
attains a minimum, where \(G(x,u,t)\) is a positive-definite function of \(x_i\) and \(u\), defined by (1.3).
Problem 1.2a. It is required to find a control \(u[x,t]\), constrained by the condition \(|u[x,t]| \leqslant 1\), under which the unperturbed motion \(x=0\) is asymptotically stable and, along the trajectories of system (1.1), the functional (1.4), (1.3) attains its minimum.
Remark 1.1. In discussing Problems 1.1, 1.1a we shall assume that \(\varphi_i(x,t)\), \(\psi_i(x,t)\), \(g(x,t)\), \(g_1(x,t)\) are continuously differentiable functions of the variables \(x_1,\ldots,x_n\) in some bounded domain \(H\), containing the origin \(x=0\), and continuous functions of time \(t\) on the interval \([0,T]\); \(f(x(T))\) is a continuously differentiable function in the domain \(H\). In the case of Problem 1.1, in addition to these conditions we shall assume that the functions \(\varphi_i(x,t)\), \(g(x,t)\) are defined in the domain
\[ 0\leqslant t\leqslant T,\qquad -\infty < x_1,\ldots,x_n < +\infty \]
and are continuous in \(t\) and \(x_1,\ldots,x_n\). Moreover,
\[ |\varphi_i(x,t)| \leqslant M \]
and, for any value of \(\bar{x}\),
\[ |\varphi_i(\bar{x},t)-\varphi_i(x(t),t)| \leqslant N\sum_{j=1}^{n}|\bar{x}_j-x_j(t)| \quad (i=1,\ldots,n), \]
where \(M\) and \(N\) are certain constants, and \(x(t)\) is a solution of the system \(dx/dt=\varphi(x,t)\) under the initial condition \(x(t_0)=x^0\), \(0\leqslant t_0\leqslant T\).
In discussing Problems 1.2, 1.2a we shall assume that \(\varphi_i(x,t)\), \(\psi_i(x,t)\), \(g(x,t)\), \(g_1(x,t)\) are analytic in \(x_i\) in the domain \(H\) and continuous and bounded functions of time \(t\geqslant 0\).
Remark 1.2. We shall require fulfillment of the conditions of Problems 1.1–1.2a for initial perturbations \(x^0\) from a closed domain \(H_0\subset H\). If \(H\) is the whole space, then \(H_0\) is also the whole space.
Problems of the form 1.1–1.2a have been considered in a number of works (see, for example, [1–10]) and have been studied in detail in the case of linear systems [2, 4–9]. For completeness of exposition, we give sufficient criteria for optimality of the control [6–10].
Criterion 1.1. If one can specify functions \(v[x,t]\) and \(u^0[x,t]\) satisfying the conditions:
1)
\[ (dv/dt)_{u^0[x,t]}=-G(x,u^0[x,t],t), \]
where the symbol \((dv/dt)_{u^0[x,t]}\) denotes the total derivative with respect to time of the function \(v[x,t]\) by virtue of equations (1.1) for \(u=u^0[x,t]\);
2)
\[ v(x(T),T)=f(x(T)); \]
3) the function
\[ \Phi(x,u,t)=(dv/dt)_u+G(x,u,t) \]
has a minimum with respect to the admissible values of \(u\) at each point \(x\) of the domain \(H\) for all \(0\leqslant t\leqslant T\) when the function \(u=u^0[x,t]\) is substituted, then the control action \(u=u^0[x,t]\) is an optimal control in the sense of Problem 1.1 (or 1.1a) for all \(x^0\) from the domain \(H_0\), where
\[ \sup [\,v[x,t]\ \text{on the boundary of } H_0\ \text{for } 0\leqslant t_0\leqslant t\leqslant T\,] < \inf [\,v[x,t]\ \text{on the boundary of } H\ \text{for } 0\leqslant t_0\leqslant t\leqslant T\,]. \]
Criterion 1.2. If it is possible to specify functions \(v[x,t]\) and \(u^0[x,t]\) satisfying the following conditions:
1) the function \(v[x,t]\) satisfies the conditions of Lyapunov’s theorem on asymptotic stability [11–13];
2)
\[
\left(\frac{dv}{dt}\right)_{u^0[x,t]}=-G(x,u^0[x,t],t);
\]
3) the function
\[
\Phi(x,u,t)=\left(\frac{dv}{dt}\right)_u+G(x,u,t)
\]
has a minimum with respect to the admissible values of \(u\) at every point \(x\in H\) and for all \(t\geq 0\) when the function \(u=u^0[x,t]\) is substituted, then the control action \(u=u^0[x,t]\) is an optimal control in the sense of Problem 1.2 (or 1.2a) for all \(x^0\) from \(H_0\), where
\[
\sup\,[v[x,t]\ \text{on the boundary of } H_0 \text{ for } 0\leq t_0\leq t<\infty]
<\inf\,[v[x,t]\ \text{on the boundary of } H \text{ for } 0\leq t_0\leq t<\infty].
\]
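For illustration, Criterion 1.2 can be checked by hand on a scalar example. The example is not taken from the paper and is chosen here only for its simplicity. Assume \(n=1\), \(\varphi\equiv 0\), \(\psi\equiv 1\), \(g=x^2\), \(g_1\equiv 0\), so that \(dx/dt=u\) and \(G=x^2+u^2\). For the trial pair \(v=x^2\), \(u^0=-x\) one finds
\[ \Phi(x,u,t)=2xu+x^2+u^2=(x+u)^2\geq 0, \]
\[ \left(\frac{dv}{dt}\right)_{u^0}=-2x^2=-G(x,-x,t). \]
Thus \(\Phi\) attains its minimum value, zero, at \(u=u^0[x,t]=-x\), and condition 2) holds. Moreover, \(v\) is positive definite and its derivative along \(dx/dt=-x\) is negative definite, so condition 1) holds as well. The inequality between the supremum and the infimum is satisfied, for instance, for \(H_0=[-a,a]\) and \(H=(-b,b)\) with \(0<a<b\), since \(v=a^2\) on the boundary of \(H_0\) and \(v=b^2\) on the boundary of \(H\).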
Thus, in order to solve Problems 1.1–1.2a it is sufficient to find functions \(v[x,t]\) and \(u^0[x,t]\) satisfying the equation
\[
\min_u\left\{\left(\frac{dv}{dt}\right)_u+G(x,u,t)\right\}=0
\tag{1.5}
\]
under the boundary conditions stipulated in Criteria 1.1, 1.2. In solving Problems 1.1a and 1.2a, the minimum in (1.5) should be determined taking into account the additional constraint \(|u|\leq 1\), which narrows the class of admissible controls. Criteria 1.1 and 1.2 combine the ideas of Lyapunov’s function method [11–13] and Bellman’s dynamic programming method [14]. Therefore, the function \(v[x,t]\) satisfying the conditions of Criterion 1.2 will be called an optimal Lyapunov function, and equation (1.5) will be called the Lyapunov–Bellman equation. From the maximum principle [1] it follows that equation (1.5) plays the same role in the theory of optimal systems as the Hamilton–Jacobi equation does in classical analytical mechanics.
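Under the assumption that \(v\) is differentiable and that \(u\) is unconstrained (Problems 1.1 and 1.2), equation (1.5) with \(G\) from (1.3) can be written out explicitly. The following is a routine computation added for orientation, not a formula quoted from the paper; the shorthand \(b\) is introduced here only for this purpose. The total derivative along (1.1) is
\[ \left(\frac{dv}{dt}\right)_u=\frac{\partial v}{\partial t}+\sum_{i=1}^{n}\frac{\partial v}{\partial x_i}\bigl[\varphi_i(x,t)+\psi_i(x,t)u\bigr], \]
so the expression in braces in (1.5) is a quadratic polynomial in \(u\),
\[ \Phi(x,u,t)=\frac{\partial v}{\partial t}+\sum_{i=1}^{n}\frac{\partial v}{\partial x_i}\varphi_i+g+b\,u+u^2,\qquad b=\sum_{i=1}^{n}\psi_i\frac{\partial v}{\partial x_i}+g_1, \]
whose minimum is attained at \(u=-b/2\). Substituting this value reduces (1.5) to a first-order partial differential equation for \(v\):
\[ \frac{\partial v}{\partial t}+\sum_{i=1}^{n}\frac{\partial v}{\partial x_i}\varphi_i(x,t)+g(x,t)-\frac{1}{4}\left(\sum_{i=1}^{n}\psi_i(x,t)\frac{\partial v}{\partial x_i}+g_1(x,t)\right)^{2}=0. \]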
The aim of the present paper is to show that, under certain conditions, there exists a continuously differentiable function \(v[x,t]\) satisfying the conditions of Criterion 1.1 in the case of Problems 1.1 and 1.1a, and a continuously differentiable optimal Lyapunov function satisfying the conditions of Criterion 1.2 in the case of Problems 1.2 and 1.2a, and that, consequently, the problem of the analytic construction of a regulator has a continuous solution \(u[x,t]\). This problem is analogous to the converse of Lyapunov’s theorems in the theory of stability of motion [13], with the difference that the optimal Lyapunov function satisfies additional conditions connected with the specific formulation of the optimal control Problems 1.1 and 1.2.
We shall assume that for each point \(x^0\) in some domain of initial disturbances \(H_0\) there exists a unique optimal (in the sense of Problems 1.1–1.2a) control \(u_{x^0,t_0}(t)\), i.e., from each point of the domain of initial disturbances \(H_0\) there emerges only one trajectory of system (1.1) along which the functional (1.2) (or (1.4)) attains its minimum for \(u=u_{x^0,t_0}(t)\). We shall also assume that all trajectories of system (1.1) issuing from \(H_0\) and generated by optimal controls are entirely contained in the domain \(H\).
Remark 1.3. For simplicity it is assumed in the paper that \(u\) is a scalar function; however, all the arguments can be carried over to the case where \(u\) is a vector function.
§ 2. Auxiliary results
Let \(x^0\) and \(\bar{x}^0\) be arbitrary points of \(H_0\) \((0 \leq t_0 \leq T)\), the distance between which is sufficiently small. Consider a convergent sequence of points
\[
\bar{x}^0,\ \bar{x}^0_{(1)},\ \ldots,\ \bar{x}^0_{(k)},\ \ldots \to x^0 .
\tag{2.1}
\]
To this sequence of points there corresponds a sequence of optimal controls
\[
u_{\bar{x}^0,t_0}(t),\quad
u_{\bar{x}^0_{(1)},t_0}(t),\ \ldots,\
u_{\bar{x}^0_{(k)},t_0}(t),\ \ldots
\tag{2.2}
\]
and a sequence of optimal trajectories of the system (1.1)
\[
x(\bar{x}^0,t_0,t),\quad
x(\bar{x}^0_{(1)},t_0,t),\ \ldots,\
x(\bar{x}^0_{(k)},t_0,t),\ \ldots .
\tag{2.3}
\]
Lemma 2.1. If for every point \(x^0\) of the region of initial perturbations \(H_0\) there exists an optimal control \(u_{x^0,t_0}(t)\) in the sense of Problem 1.1, then the set of optimal trajectories (2.3) is equicontinuous.
Proof. From the conditions of Problem 1.1 it follows that
\[
\int_{t_0}^{T} u^2_{x^0,t_0}(t)\,dt
\leq
\int_{t_0}^{T} G\bigl(x(x^0,t_0,t),\,u_{x^0,t_0}(t),\,t\bigr)\,dt
\leq
\int_{t_0}^{T} g\bigl(x_{u=0}(x^0,t_0,t),\,t\bigr)\,dt,
\]
where \(x_{u=0}(x^0,t_0,t)\) is the solution of the system (1.1) under the initial condition \(t=t_0,\ x=x^0\) and with \(u=0\). Then from the assumptions (see Remark 1.1) and Carathéodory’s theorem [15] it follows that
\[
\int_{t_0}^{T} u^2_{x^0,t_0}(t)\,dt \leq N_1^2
\quad \text{for } x^0 \in H_0,
\tag{2.4}
\]
i.e. the functions \(u_{x^0,t_0}(t)\) may be regarded as elements of the functional space \(L_2\) ([16], p. 70), lying in the ball \(\|u\|\leq N_1\). Write the system (1.1) in integral form:
\[
x(t)=\bar{x}^0+\int_{t_0}^{t}
\bigl[\varphi(x(\tau),\tau)+\psi(x(\tau),\tau)u_{\bar{x}^0,t_0}(\tau)\bigr]\,d\tau .
\tag{2.5}
\]
Then
\[
\Delta x=x(t+\Delta t)-x(t)
=\int_{t}^{t+\Delta t}
\bigl[\varphi(x(\tau),\tau)+\psi(x(\tau),\tau)u_{\bar{x}^0,t_0}(\tau)\bigr]\,d\tau ,
\tag{2.6}
\]
\[
\overline{\Delta x}=
\int_{t}^{t+\Delta t}
\psi(x(\tau),\tau)u_{\bar{x}^0,t_0}(\tau)\,d\tau .
\tag{2.7}
\]
To prove the lemma it suffices to show that \(\|\overline{\Delta x}\|\to 0\) as \(\Delta t\to 0\), uniformly with respect to the set of functions (2.3).
For each function \(u_{x^0,t_0}(t)\), expression (2.7) may be regarded as a continuous linear operator ([16], p. 96) \(T_u[h_{t,\Delta t}(\tau)]\), defined on the set of \(h(\tau)\) from \(L_2[0,T]\) and, in particular, on the set of functions of the form
\[ h_{t,\Delta t}(\tau)= \begin{cases} \psi(x(\tau),\tau), & t\leq \tau \leq t+\Delta t,\\ 0, & t_0\leq \tau < t,\quad t+\Delta t<\tau \leq T. \end{cases} \]
Thus we have a family of operators \(T_u\), depending on \(u\). From (2.4), (2.7) it follows that
\[ \|T_u[h_{t,\Delta t}]\|\leq N_1\|h_{t,\Delta t}\| \]
and, moreover, \(\|h_{t,\Delta t}(\tau)\|\to 0\) as \(\Delta t\to 0\). Therefore, by the principle of uniform boundedness ([17], p. 65), it follows that
\[ \lim_{\Delta t\to 0} T_u[h_{t,\Delta t}(\tau)]=0 \]
uniformly with respect to every function \(h_{t,\Delta t}(\tau)\) and \(u\), which proves the lemma.
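The estimate behind this argument can be made explicit with the Cauchy–Bunyakovskii (Cauchy–Schwarz) inequality. The sketch below assumes a constant \(K\), not named in the paper, such that \(|\psi_i(x,t)|\leq K\) along all optimal trajectories; this is an additional assumption, plausible here because these trajectories remain in the bounded domain \(H\). For \(\Delta t>0\) and \(u=u_{\bar{x}^0,t_0}\),
\[ \bigl|\overline{\Delta x}_i\bigr| =\left|\int_{t}^{t+\Delta t}\psi_i(x(\tau),\tau)\,u(\tau)\,d\tau\right| \leq\left(\int_{t}^{t+\Delta t}\psi_i^2(x(\tau),\tau)\,d\tau\right)^{1/2} \left(\int_{t_0}^{T}u^2(\tau)\,d\tau\right)^{1/2} \leq K N_1\sqrt{\Delta t}. \]
The right-hand side depends neither on \(t\) nor on the initial point, which is the uniformity required. The remaining term of (2.6), containing \(\varphi\), is bounded in each component by \(M\,\Delta t\) by virtue of Remark 1.1.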
Remark 2.1. In the case of problem 1.1a, Lemma 2.1 is automatically satisfied, since from \(|u_{x^0,t_0}(t)|\leq 1\) follows (2.4), and then from the boundedness of the functions \(\varphi_i(x,t)\), \(\psi_i(x,t)\), \(u_{x^0,t_0}(t)\), \(x(x^0,t_0,t)\) follows the equicontinuity of the set of functions (2.3).
Lemma 2.2. If the optimal control in the sense of problem 1.1 (1.1a) is unique for each point of the domain of initial perturbations \(H_0\), then the sequence of optimal controls (2.2) has a weak limit \(u^*(t)\), and this limit is the optimal control corresponding to the point \(x^0\), i.e. \(u^*(t)=u_{x^0,t_0}(t)\), while the sequence of optimal trajectories converges uniformly to the curve \(x(x^0,t_0,t)\).
Proof. The set of functions (2.3) is uniformly bounded, since, by assumption, all trajectories of system (1.1) emanating from \(H_0\) are entirely contained in \(H\), and it is equicontinuous by Lemma 2.1. Hence, by the Arzelà–Ascoli theorem ([16], p. 29), the set (2.3) is compact in the space \(C\). Therefore from (2.3) one may extract a subsequence converging uniformly to a curve \(x^{**}(x^0,t_0,t)\). From the sequence of optimal controls (2.2) we extract a weakly convergent subsequence
\[ u_{x^0_{(1)},t_0}(t),\ldots,u_{x^0_{(k)},t_0}(t),\ldots \rightharpoonup u^*(t). \tag{2.8} \]
This is possible by virtue of (2.4) and the weak compactness of bounded sets in \(L_2\). Let the curve \(x^{*}(x^0,t_0,t)\) be generated by the function \(u^*(t)\) under the initial condition \(t=t_0\), \(x^*(t_0)=x^0\). We shall show that \(x^{**}(t)\equiv x^*(t)\). To this end suppose the contrary, \(x^{**}(t)\not\equiv x^*(t)\), and pass to the limit in (2.5) as \(\bar{x}^0\to x^0\). We obtain
\[ x^{**}(t)=x^0+\int_{t_0}^{t}\varphi(x^{**}(\tau),\tau)\,d\tau +\lim_{\bar{x}^0\to x^0}\int_{t_0}^{t}\psi(x(\tau),\tau)u_{\bar{x}^0,t_0}(\tau)\,d\tau . \]
Using the continuity of the function \(\psi(x,t)\) and the definition of weak convergence, we compute the limit
\[ \lim_{\bar x^0 \to x^0}\int_{t_0}^{t}\psi(x(\tau),\tau)u_{\bar x^0,t_0}(\tau)\,d\tau = \lim_{\bar x^0 \to x^0}\int_{t_0}^{t}\left[\psi(x(\tau),\tau)u_{\bar x^0,t_0}(\tau)- \right. \]
\[ \left. -\psi(x^{**}(\tau),\tau)u_{\bar x^0,t_0}(\tau) +\psi(x^{**}(\tau),\tau)u_{\bar x^0,t_0}(\tau)\right]d\tau = \]
\[ = \lim_{\bar x^0 \to x^0}\int_{t_0}^{t}\psi(x^{**}(\tau),\tau)u_{\bar x^0,t_0}(\tau)d\tau = \int_{t_0}^{t}\psi(x^{**}(\tau),\tau)u^*(\tau)d\tau . \]
Thus, we obtain
\[ x^{**}(t)=x^0+\int_{t_0}^{t}\left[\varphi(x^{**}(\tau),\tau)+\psi(x^{**}(\tau),\tau)u^*(\tau)\right]d\tau, \]
i.e., from the point \(x^0\), for \(u=u^*(\tau)\), two trajectories issue, which is impossible by virtue of the uniqueness of the solution of the differential equations (1.1). Consequently, the supposition is false and \(x^{**}(t)=x^*(t)\).
We now show that \(u^*(t)\) is the optimal control corresponding to the point \(x^0\). To this end suppose, contrary to the assertion of the lemma, that there exists a control \(u^0(t)\) such that the inequality
\[ \int_{t_0}^{T}G(x_{u^0}(x^0,t_0,t),u^0(t),t)\,dt+f(x_{u^0}(T))< \]
\[ <\int_{t_0}^{T}G(x^*(x^0,t_0,t),u^*(t),t)\,dt+f(x^*(T)), \tag{2.9} \]
holds, where \(x_{u^0}(x^0,t_0,t)\) is the trajectory generated by \(u^0(t)\) under the initial condition \(t=t_0\), \(x_{u^0}(x^0,t_0,t_0)=x^0\). Since \(u_{x^0_{(k)},t_0}(t)\) is the optimal control corresponding to the point \(x^0_{(k)}\), we have the inequality
\[ \int_{t_0}^{T}G(x_{u^0}(x^0_{(k)},t_0,t),u^0(t),t)\,dt+f(x_{u^0}(x^0_{(k)},t_0,T)) \geq \]
\[ \geq \int_{t_0}^{T}G(x(x^0_{(k)},t_0,t),u_{x^0_{(k)},t_0}(t),t)\,dt +f(x(x^0_{(k)},t_0,T)). \]
Taking into account the arguments given above, we pass to the limit in this relation as \(x^0_{(k)} \to x^0\) and arrive at a contradiction with the supposition (2.9). Consequently, \(u^*(t)\) is an optimal control and, by virtue of the uniqueness of the optimal control, \(u^*(t)=u_{x^0,t_0}(t)\).
Thus, we have shown that from the sequence of optimal controls (2.2) one can extract a subsequence weakly convergent to the optimal control \(u_{x^0,t_0}(t)\), and from the sequence of solutions (2.3) one can extract a subsequence uniformly convergent to the curve \(x(x^0,t_0,t)\). The same reasoning applies to every subsequence of (2.2) and (2.3), and the limits are the same in each case by the uniqueness of the solution of system (1.1) and the uniqueness of the optimal control. Hence the entire sequences converge, weakly and uniformly respectively, which proves the lemma.
Consider the following systems of differential equations:
\[ \frac{dx_i^{(1)}}{dt}=\varphi_i(x^{(1)},t)+\psi_i(x^{(1)},t)u_{x^0,t_0}(t)\quad (i=1,\ldots,n), \tag{2.10} \]
\[ \frac{d x_i^{(2)}}{d t}=\varphi_i(x^{(2)},t)+\psi_i(x^{(2)},t)u_{\bar{x}^0,t_0}(t)\quad (i=1,\ldots,n), \tag{2.11} \]
where \(u_{x^0,t_0}(t)\), \(u_{\bar{x}^0,t_0}(t)\) are fixed functions, optimal controls corresponding to the points \(x^0\) and \(\bar{x}^0\). The solution of system (2.10) under the initial condition
\[
x_i^{(1)}(t_0)=x_i^0+\xi_i=\xi_i^{(1)}
\]
will be denoted by the symbol
\[
x_i^{(1)}(t)=x_i^{(1)}(\xi^{(1)},t_0,t),
\]
and the solution of system (2.11) by the symbol
\[
x_i^{(2)}(t)=x_i^{(2)}(\xi^{(2)},t_0,t),
\]
and
\[
x_i^{(2)}(t_0)=\bar{x}_i^0+\xi_i=\xi_i^{(2)}.
\]
Here \(\xi_i\) are certain small parameters determining the initial conditions for the solutions \(x_i^{(1)}(t)\), \(x_i^{(2)}(t)\).
Lemma 2.3. If the optimal control \(u_{x^0,t_0}(t)\) (in the sense of problems 1.1 and 1.1a) is unique at every point of the domain of initial perturbations, then the relation
\[ \lim_{\bar{x}^0\to x^0} \left( \frac{\partial x_i^{(1)}}{\partial \xi_j^{(1)}}- \frac{\partial x_i^{(2)}}{\partial \xi_j^{(2)}} \right)=0 \tag{2.12} \]
\[ (i,j=1,\ldots,n),\qquad (0\leq t_0\leq t\leq T) \]
holds uniformly with respect to \(\xi\) for \(|\xi_j|<\alpha\), where \(\alpha\) is a sufficiently small positive number.
Proof. The derivatives of the solutions with respect to the initial data satisfy systems of linear homogeneous equations ([19], p. 301):
\[ \frac{d z_i^{(1)}}{d t} = \sum_{s=1}^{n} \left[ \frac{\partial \varphi_i(x^{(1)}(t),t)}{\partial x_s^{(1)}} + \frac{\partial \psi_i(x^{(1)}(t),t)}{\partial x_s^{(1)}} u_{x^0,t_0}(t) \right]z_s^{(1)}, \tag{2.13} \]
\[ \frac{d z_i^{(2)}}{d t} = \sum_{s=1}^{n} \left[ \frac{\partial \varphi_i(x^{(2)}(t),t)}{\partial x_s^{(2)}} + \frac{\partial \psi_i(x^{(2)}(t),t)}{\partial x_s^{(2)}} u_{\bar{x}^0,t_0}(t) \right]z_s^{(2)}, \tag{2.14} \]
where
\[ z_i^{(1)}=\frac{\partial x_i^{(1)}}{\partial \xi_j^{(1)}},\qquad z_i^{(2)}=\frac{\partial x_i^{(2)}}{\partial \xi_j^{(2)}} \quad (i,j=1,\ldots,n). \]
The solution of systems (2.13), (2.14) under the initial conditions
\[ t=t_0,\qquad z_j^{(l)}(t_0)=1,\quad z_i^{(l)}(t_0)=0 \quad \text{for } i\ne j, \tag{2.15} \]
\[ (l=1,2),\quad (j=1,\ldots,n) \]
determines the groups of derivatives:
\[ \frac{\partial x_1^{(1)}}{\partial \xi_j^{(1)}},\ldots, \frac{\partial x_n^{(1)}}{\partial \xi_j^{(1)}} \quad (j=1,\ldots,n), \]
\[ \frac{\partial x_1^{(2)}}{\partial \xi_j^{(2)}},\ldots, \frac{\partial x_n^{(2)}}{\partial \xi_j^{(2)}} \quad (j=1,\ldots,n). \]
To prove the lemma it is sufficient to verify that the solution of system (2.14), under one of the initial conditions (2.15), converges uniformly as \(\bar{x}^0\to x^0\) to the corresponding solution of system (2.13). For this, consider the sequence of equations (2.14) corresponding to the sequence of points (2.1), and the sequence of solutions of these equations under one of the initial conditions (2.15). This sequence of solutions is uniformly bounded and equicontinuous; therefore from it one can extract a subsequence converging uniformly to some function \(z^*(t)\). Using the uniqueness of the solution of system (2.13) and the results of Lemmas 2.1, 2.2, one can, in the same way as above in the proof of Lemma 2.2, show that \(z^*(t)=z^{(1)}(t)\). But then the entire sequence of solutions will converge uniformly to the solution \(z^{(1)}(t)\) of system (2.13) (see Lemma 2.2).
§ 3. Main results
Theorem 3.1. If for each point of the domain of initial perturbations \(H_0\) there exists a unique optimal control in the sense of problem 1.1, then there exists a continuously differentiable function \(v[x,t]\) satisfying the conditions of criterion 1.1.
Proof. From the condition of criterion 1.1 it follows that
\[ v[x^0,t]=\int_t^T \left[g\bigl(x(x^0,t,\tau),\tau\bigr)+g_1\bigl(x(x^0,t,\tau),\tau\bigr)u_{x^0,t}(\tau)+u_{x^0,t}^2(\tau)\right]d\tau+f(x(T)). \tag{3.1} \]
To prove the theorem it is sufficient to show that the function (3.1) has continuous partial derivatives with respect to the variables \(x_i\) ([13], p. 297; [20], p. 380) for all \(0\le t\le T\). Using the concept of the variational derivative ([21], p. 33) and taking into account that the variation of (3.1) with respect to \(u\) is equal to zero, we obtain
\[ \frac{\delta v}{\delta x_j^0} = \int_t^T \left\{ \sum_{i=1}^n \left[ \frac{\partial g}{\partial x_i}\cdot\frac{\partial x_i}{\partial x_j^0} + \frac{\partial g_1}{\partial x_i}\cdot\frac{\partial x_i}{\partial x_j^0}\,u_{x^0,t}(\tau) \right] \right\} d\tau + \sum_{i=1}^n \frac{\partial f}{\partial x_i}\cdot\frac{\partial x_i}{\partial x_j^0} \quad (j=1,\ldots,n). \]
From Lemmas 2.1 and 2.2 follows the continuity of the partial derivatives \(\delta v/\delta x_j^0\), which proves the theorem.
Theorem 3.2. If for each point of the domain of initial perturbations \(H_0\) there exists a unique optimal control in the sense of problem 1.1a, then there exists a continuously differentiable function \(v[x,t]\) satisfying the conditions of criterion 1.1.
Proof. Consider the function
\[ v_1(\xi^{(1)},t) = \int_t^T \left[ g\bigl(x^{(1)}(\xi^{(1)},t,\tau),\tau\bigr) + g_1\bigl(x^{(1)}(\xi^{(1)},t,\tau),\tau\bigr)u_{x^0,t}(\tau) + u_{x^0,t}^2(\tau) \right]d\tau + f(x^{(1)}(T)). \tag{3.2} \]
This function exists, is continuous, and has continuous derivatives with respect to \(\xi_j^{(1)}\) ([19], p. 299; [22], p. 665)
\[ \frac{\partial v_1(\xi^{(1)},t)}{\partial \xi_j^{(1)}} = \int_t^T \left\{ \sum_{i=1}^n \left[ \frac{\partial g}{\partial x_i^{(1)}}\cdot\frac{\partial x_i^{(1)}}{\partial \xi_j^{(1)}} + \frac{\partial g_1}{\partial x_i^{(1)}}\cdot\frac{\partial x_i^{(1)}}{\partial \xi_j^{(1)}}\cdot u_{x^0,t}(\tau) \right] \right\} d\tau +\sum_{i=1}^{n}\frac{\partial f}{\partial x_i^{(1)}}\cdot \frac{\partial x_i^{(1)}}{\partial \xi_j^{(1)}} . \tag{3.3} \]
For \(\xi_i=0\), from (3.3) we obtain the values of the derivatives of the function \(v_1(\xi^{(1)},t)\) at the point \(x^0\), which determine the tangent plane to the surface (3.2) at this point for any \(0\le t_0\le t\le T\)
\[ \sum_{j=1}^{n}\left(\frac{\partial v_1}{\partial \xi_j^{(1)}}\right)_{\xi=0} (y_j-x_j^0)=0, \tag{3.4} \]
where \(y_j\) is the current coordinate of a point of the tangent plane.
We shall show that the following equalities hold:
\[ \lim_{\Delta x_j^0\to 0} \frac{v[x^0+\Delta x^0,t]-v[x^0,t]}{\Delta x_j^0} = \left(\frac{\partial v_1}{\partial \xi_j^{(1)}}\right)_{\xi=0}, \tag{3.5} \]
i.e., the function (3.1), computed under variable control, and the function (3.2), computed under fixed control, have at the point \(x^0\) a common tangent plane.
Indeed, two cases may arise. First, in a small neighborhood of the point \(x^0\), the distance between the points of the surface (3.1) and the points of the tangent plane (3.4) will be of second order of smallness in comparison with \(\xi\). In this case the function \(v[x^0,t]\) is differentiable at the point \(x^0\), and the relations (3.5) hold. Second, in a small neighborhood of the point \(x^0\), in the direction toward the point \(\bar{x}^0=x^0+\Delta x^0\), the distance between the points of the surface \(v[x^0,t]\) and the plane (3.4) will be of first order of smallness in comparison with \(\xi\). In this case, by virtue of the optimality of the control \(u_{x^0,t}(\tau)\), the points of the surface \(v[x^0,t]\) will lie below the tangent plane (3.4). We shall show that this case is impossible. To this end, let us fix the control corresponding to the point \(\bar{x}^0\), and construct at this point the tangent plane to the function
\[ v_2(\xi^{(2)},t)= \int_t^T \bigl[ g(x^{(2)}(\xi^{(2)},t,\tau),\tau) + g_1(x^{(2)}(\xi^{(2)},t,\tau),\tau)\,u_{\bar{x}^0,t}(\tau)+u_{\bar{x}^0,t}^{\,2}(\tau) \bigr]\,d\tau + f(x^{(2)}(T)). \tag{3.6} \]
This plane is determined by the relations
\[ \sum_{j=1}^{n}\left(\frac{\partial v_2}{\partial \xi_j^{(2)}}\right)_{\xi=0} (y_j-\bar{x}_j^0)=0, \tag{3.7} \]
where
\[ \frac{\partial v_2}{\partial \xi_j^{(2)}}= \int_t^T \left\{ \sum_{i=1}^{n} \left[ \frac{\partial g}{\partial x_i^{(2)}}\cdot \frac{\partial x_i^{(2)}}{\partial \xi_j^{(2)}} + \frac{\partial g_1}{\partial x_i^{(2)}}\cdot \frac{\partial x_i^{(2)}}{\partial \xi_j^{(2)}} \,u_{\bar{x}^0,t}(\tau) \right] \right\}\,d\tau +\sum_{i=1}^{n}\frac{\partial f}{\partial x_i^{(2)}}\cdot \frac{\partial x_i^{(2)}}{\partial \xi_j^{(2)}} . \]
It follows from Lemmas 2.2 and 2.3 that, as \(\bar{x}^0\to x^0\), the tangent plane (3.7) continuously passes into the tangent plane (3.4). Therefore, when \(\Delta x^0\) is sufficiently small, the value of the function \(v_2(\xi^{(2)},t)\) at the point \(x^0\) will be less than \(v[x^0,t]\) by a quantity of the first order of smallness, which contradicts the optimality of the control \(u_{x^0,t}(\tau)\). This means that the second case is impossible and the function \(v[x,t]\) has continuous partial derivatives with respect to \(x_i\).
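One way to read this argument, offered here as an interpretation rather than as the author's own formulation, is as a two-sided comparison. A control that is optimal for one initial point is merely admissible for a neighboring one (assuming the corresponding trajectory stays in \(H\)). Hence, treating \(v_1\) and \(v_2\) as functions of the initial point \(y\),
\[ v[y,t]\leq v_1(y,t),\qquad v[y,t]\leq v_2(y,t), \]
with equality at \(y=x^0\) and at \(y=\bar{x}^0\), respectively. It follows that
\[ v_2(\bar{x}^0,t)-v_2(x^0,t)\ \leq\ v[\bar{x}^0,t]-v[x^0,t]\ \leq\ v_1(\bar{x}^0,t)-v_1(x^0,t). \]
Both outer differences are increments of continuously differentiable functions, and by Lemmas 2.2 and 2.3 the gradient of \(v_2\) approaches that of \(v_1\) as \(\bar{x}^0\to x^0\), uniformly for \(|\xi_j|<\alpha\). Taking \(\Delta x^0\) along the \(j\)-th coordinate axis, dividing by \(\Delta x_j^0\), and passing to the limit then squeezes the difference quotient of \(v\) to the common value on the right-hand side of (3.5).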
Theorem 3.3. If, for every point of the domain of initial disturbances \(H_0\), there exists a unique optimal control in the sense of problem 1.2 (or 1.2a), then there exists a continuously differentiable Lyapunov function satisfying the conditions of criterion 1.2.
Proof. From the conditions of the theorem it follows that
\[ v[x^0,t]=\int_t^\infty G\bigl(x(x^0,t,\tau),\,u_{x^0,t}(\tau),\,\tau\bigr)\,d\tau \leq N_2,\qquad x^0\in H_0, \]
where \(N_2\) is a finite positive number.
From this inequality and from the equicontinuity of the solutions \(x(x^0,t_0,t)\) of system (1.1), it follows that the time instant \(t=T\) can be chosen so that, for all \(t\geq T\), any solution of system (1.1) with \(x^0\in H_0\) lies in a sufficiently small neighborhood of the point \(x=0\). Then one can write
\[ v[x^0,t]=\int_t^T G\bigl(x(x^0,t,\tau),\,u_{x^0,t}(\tau),\,\tau\bigr)\,d\tau+v[x(T),T]. \]
The first term in this expression is continuously differentiable by Theorems 3.1 and 3.2. The existence of continuous partial derivatives of the second term follows from the results of works [23–25].
Theorem 3.4. If, for every point of the domain of initial disturbances \(H_0\), there exists a unique optimal control (in the sense of problems 1.1–1.2a), then there exists a continuous solution of the problem of analytic construction of a regulator.
Proof. From the Lyapunov–Bellman equation (1.5) it follows that
\[ u[x,t]=-\frac{1}{2}\sum_{i=1}^n \psi_i(x,t)\frac{\partial v}{\partial x_i} -\frac{1}{2}g_1(x,t). \]
Then the validity of Theorem 3.4 follows from the continuity of the functions \(\psi_i(x,t)\) and \(g_1(x,t)\) and from Theorems 3.1–3.3.
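The feedback formula can be tried numerically on a scalar finite-horizon example. The sketch below is illustrative only: the function names `feedback` and `closed_loop_cost`, the test system \(dx/dt=u\), \(G=x^2+u^2\), \(f\equiv 0\), and the closed form \(v[x,t]=\tanh(T-t)\,x^2\) (which can be checked by substitution into (1.5)) are assumptions introduced here, not taken from the paper. The `bounded=True` branch clips the control to \([-1,1]\); this is the minimizer of a convex quadratic on an interval, whereas the paper states the formula only in its unclipped form.

```python
import math


def feedback(grad_v, psi, g1, bounded=False):
    """Return u = -1/2 * (sum_i psi_i * dv/dx_i + g1).

    With bounded=True the value is clipped to [-1, 1] (assumed treatment
    of the constraint |u| <= 1, not a formula quoted from the paper).
    """
    u = -0.5 * (sum(p * dv for p, dv in zip(psi, grad_v)) + g1)
    if bounded:
        u = max(-1.0, min(1.0, u))
    return u


def closed_loop_cost(x0, t0=0.0, T=1.0, steps=20000):
    """Integrate dx/dt = u with G = x^2 + u^2, f = 0 by the Euler method.

    The control is the feedback built from the assumed value function
    v[x, t] = tanh(T - t) * x^2; the accumulated cost is returned.
    """
    dt = (T - t0) / steps
    x, cost = x0, 0.0
    for k in range(steps):
        t = t0 + k * dt
        grad = [2.0 * math.tanh(T - t) * x]   # dv/dx for the assumed v
        u = feedback(grad, [1.0], 0.0)        # psi = 1, g1 = 0
        cost += (x * x + u * u) * dt          # running cost G
        x += u * dt                           # Euler step of dx/dt = u
    return cost
```

If the assumed \(v\) is indeed the optimal value, `closed_loop_cost(x0)` should agree with `math.tanh(T - t0) * x0**2` up to the discretization error of the Euler scheme.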
Remark 3.1. The requirement of uniqueness of the optimal control for every point of the domain of initial disturbances is essential. Consider the example
\[ \frac{dx}{dt}=-x+\mu y^2,\qquad \frac{dy}{dt}=u, \]
\[ I(u)=\int_0^\infty \bigl(x^2(t)+y^2(t)+u^2(t)\bigr)\,dt, \]
where \(\mu>0\) is a certain parameter. It can be shown that, for large initial disturbances \(x\) and sufficiently large values of \(\mu\), the optimal Lyapunov function is not differentiable at points lying on the negative semiaxis of abscissas, and from these points there issue two optimal trajectories.
The author expresses deep gratitude to N. N. Krasovskii for the formulation of the problem and for his comments.
References
1. Pontryagin L. S., Boltyanskii V. G., Gamkrelidze R. V., Mishchenko E. F. The Mathematical Theory of Optimal Processes. Fizmatgiz, 1961.
2. Letov A. M. Analytic construction of regulators. Automation and Remote Control, vol. XXI, Nos. 4–6, 1960; vol. XXII, No. 4, 1961.
3. Bellman R., Kalaba R. The theory of dynamic programming and control systems with feedback. Report at the First International IFAC Congress, 1960.
4. Kalman R. E. On the general theory of control systems. Report at the First International IFAC Congress, 1960.
5. Kurzweil J. On the analytic construction of regulators. Automation and Remote Control, vol. XXII, issue 2, 1961.
6. Krasovskii N. N. On a pursuit problem. PMM, vol. XXVI, issue 2, 1962.
7. Krasovskii N. N., Letov A. M. On the theory of analytic construction of regulators. Automation and Remote Control, vol. XXIII, No. 6, 1962.
8. Krasovskii N. N. On the stabilization of unstable motions by additional forces. PMM, vol. XXVII, issue 4, 1963.
9. Kirillova F. M. On the problem of analytic construction of regulators. PMM, vol. XXV, issue 3, 1961.
10. Albrecht E. G. On optimal stabilization of nonlinear systems. PMM, vol. XXV, issue 5, 1961.
11. Lyapunov A. M. The General Problem of the Stability of Motion. Moscow–Leningrad, Gostekhizdat, 1950.
12. Chetaev N. G. Stability of Motion. GITTL, 1956.
13. Malkin I. G. Theory of Stability of Motion. GITTL, 1955.
14. Bellman R. Dynamic Programming. Moscow, IL, 1960.
15. Sansone G. Ordinary Differential Equations, vol. I. IL, 1953.
16. Kantorovich L. V., Akilov G. P. Functional Analysis in Normed Spaces. Fizmatgiz, 1959.
17. Dunford N., Schwartz J. T. Linear Operators. Moscow, IL, 1962.
18. Nemytskii V. V., Stepanov V. V. Qualitative Theory of Differential Equations. GITTL, 1949.
19. Stepanov V. V. A Course of Differential Equations. Fizmatgiz, 1959.
20. Fikhtengolts G. M. A Course of Differential and Integral Calculus, vol. I. Fizmatgiz, 1962.
21. Gelfand I. M., Fomin S. V. Calculus of Variations. Fizmatgiz, 1961.
22. Fikhtengolts G. M. A Course of Differential and Integral Calculus, vol. II. Fizmatgiz, 1962.
23. Zubov V. I. On the theory of analytic construction of regulators. Automation and Remote Control, vol. XXIV, No. 8, 1963.
24. Albrecht E. G. On the theory of analytic construction of regulators. Proceedings of the Interuniversity Conference on the Applied Theory of Stability of Motion and Analytical Mechanics. Kazan, 1962.
25. Albrecht E. G. Optimal stabilization of nonlinear systems. Mathematical Notes of the Ural Mathematical Society, vol. IV, fascicle 2, 1963.
Received by the editors March 25, 1965
Ural State University named after A. M. Gorky