On the General Theory of Optimal Processes
R. V. GAMKRELIDZE
Submitted 1958-01-01 | SovietRxiv: ru-195801.43817 | Translated from Russian

Abstract

This note gives a brief proof of Pontryagin’s maximum principle for a general optimal control problem in which a trajectory of a controlled differential system joins two prescribed points while minimizing an integral functional. The argument treats admissible controls as bounded measurable mappings into an arbitrary Hausdorff topological space and uses needle-like variations, variational equations, and supporting hyperplanes to convex cones of attainable first variations. It derives the existence of an adjoint vector satisfying the Hamiltonian system and shows that, along an optimal trajectory, the Hamiltonian attains its maximum over the control space and is identically zero under the stated normalization. The result extends the earlier time-optimal formulation to a broader class of integral optimal processes.

Full Text

MATHEMATICS

(Presented by Academician I. N. Vekua on 23 VI 1958)

The paper (¹) sets forth the maximum principle, formulated by L. S. Pontryagin, for time-optimal systems. The present note contains a brief proof of the general maximum principle, in which the optimal system minimizes an arbitrary functional of the form (3). The proof is based on the method of variations and convex sets, apparently first applied by McShane (²) and substantially improved by L. S. Pontryagin; this improvement made it possible to admit as control functions arbitrary bounded measurable functions with values in any Hausdorff topological space. By the same method V. G. Boltyanskii had earlier proved the maximum principle formulated in (¹) (see (³)).*

1°. Statement of the problem. Let \(\Omega\) be a topological Hausdorff space. By the class of admissible controls we shall mean the set of all measurable and bounded mappings into \(\Omega\) of an arbitrary interval \(T_1 \leq t \leq T_2\), where \(T_1, T_2\) are arbitrary real numbers. (We shall call a mapping \(u(t)\) of the interval \(T_1 \leq t \leq T_2\) into the space \(\Omega\) measurable and bounded if the inverse image of any open set in \(\Omega\) is measurable on the \(t\)-axis and the closure of the image of the interval \(T_1 \leq t \leq T_2\) in \(\Omega\) is compact.)

Let the real functions
\[ f^i(x^1,\ldots,x^n,u)=f^i(\bar{x},u),\quad i=0,\ldots,n, \]
be continuous on the direct product \(X^n\times \Omega\), where \(X^n\) is the \(n\)-dimensional real linear space of the variables \(x^1,\ldots,x^n\), and continuously differentiable with respect to all variables \(x^1,\ldots,x^n\).

The equations of motion of the representative point \(x=(x^0,\ldots,x^n)\) in the \((n+1)\)-dimensional phase space \(X^{n+1}\) have the normal form
\[ \dot{x}^i=f^i(x^1,\ldots,x^n,u)=f^i(\bar{x},u),\quad i=0,\ldots,n, \tag{1} \]
where vectors from \(X^{n+1}\) are denoted by letters without a bar, and vectors from the subspace \(X^n\) by letters with a bar. In \(X^n\) two points \(\bar{\xi}_1,\bar{\xi}_2\) are given; by \(\pi\) we denote the straight line in \(X^{n+1}\) containing the point \((0,\bar{\xi}_2)\) and parallel to the axis \(x^0\). It is required to choose, in the class of admissible controls, an optimal control \(u(t)\), \(T_1 \leq t \leq T_2\), such that the corresponding optimal trajectory \(x(t)\) of the system (1) is defined on the whole interval \(T_1 \leq t \leq T_2\), satisfies the initial condition \(x(T_1)=(0,\bar{\xi}_1)\), has its end \(x(T_2)\) on the straight line \(\pi\), and gives the coordinate \(x^0(T_2)\) its minimum value.

Consequently, to the optimal control \(u(t)\), \(T_1 \leq t \leq T_2\), there corresponds a trajectory \(\bar{x}(t)=(x^1(t),\ldots,x^n(t))\) of the system

\[ \dot{x}^{i}=f^{i}(x^{1},\ldots,x^{n},u)=f^{i}(\bar{x},u),\quad i=1,\ldots,n \tag{2} \]

joining the points \(\bar{\xi}_{1},\bar{\xi}_{2}\), such that the integral

\[ x^{0}=\int_{T_{1}}^{T_{2}} f^{0}(\bar{x}(t),u(t))\,dt \tag{3} \]

attains a minimum.

For \(f^{0}(\bar{x},u)\equiv 1\) the integral (3) equals \(T_2-T_1\), and the formulated problem becomes the time-optimal problem of (¹).

* The results of the note were obtained in L. S. Pontryagin’s seminar on the theory of oscillations and automatic control.
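As an illustration of how the cost is carried as the extra coordinate \(x^0\), the following minimal Python sketch integrates system (1) from \(x(T_1)=(0,\bar{\xi}_1)\) for a given control. The explicit Euler scheme, the function names, and the double-integrator right-hand side with \(f^0\equiv 1\) are assumptions chosen for the example and are not part of the note.

```python
from typing import Callable, List, Sequence


def integrate_augmented(
    f: Callable[[Sequence[float], float], Sequence[float]],
    u: Callable[[float], float],
    xbar0: Sequence[float],
    T1: float,
    T2: float,
    steps: int = 1000,
) -> List[float]:
    """Explicit Euler integration of system (1) from x(T1) = (0, xbar0).

    f(xbar, u) returns the n+1 components (f^0, ..., f^n).  The result is
    x(T2) = (x^0, x^1, ..., x^n); entry 0 approximates the integral (3).
    """
    x = [0.0] + list(xbar0)          # x^0 starts at zero
    h = (T2 - T1) / steps
    for k in range(steps):
        t = T1 + k * h
        dx = f(x[1:], u(t))          # f depends on xbar only, not on x^0
        x = [xi + h * di for xi, di in zip(x, dx)]
    return x


def double_integrator(xbar, u):
    """Hypothetical example: f^0 = 1, x1' = x2, x2' = u."""
    return (1.0, xbar[1], u)


x_end = integrate_augmented(double_integrator, lambda t: 1.0, [0.0, 0.0], 0.0, 1.0)
print(x_end)
```

With \(f^0\equiv 1\) the first returned entry is simply the elapsed time \(T_2-T_1\), up to rounding.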

2°. Definition of varied controls and trajectories. Let \(t_{1}\) be an interior point of the interval \(T_{1}\leqslant t\leqslant T_{2}\). We shall say that a given set of \(s\) intervals of lengths \(\varepsilon\sigma_{1},\ldots,\varepsilon\sigma_{s}\), where \(\sigma_{1},\ldots,\sigma_{s}\) are nonnegative numbers and the positive quantity \(\varepsilon\to 0\), is attached to the point \(t_{1}\), if the intervals can be numbered in such a way that the right endpoint of the \(i\)-th interval coincides with the left endpoint of the \((i+1)\)-st, \(i=1,\ldots,s-1\), and the right endpoint of the \(s\)-th interval coincides with \(t_{1}\).

We define the class of controls obtained by variation of a given admissible control \(u(t)\), \(T_{1}\leqslant t\leqslant T_{2}\). An admissible control \(v(t)\), \(T_{1}\leqslant t\leqslant T_{2}\), belongs to the class being defined if it is constructed as follows. Let \(t_{1},\ldots,t_{k}\) be interior points of the interval \(T_{1}\leqslant t\leqslant T_{2}\), and let all of them be Lebesgue points of the vector function
\[ f(\bar{x}(t),u(t))=(f^{0}(\bar{x}(t),u(t)),\ldots,f^{n}(\bar{x}(t),u(t))), \]
\(T_{1}\leqslant t\leqslant T_{2}\), where \(x(t)=(x^{0}(t),\bar{x}(t))\) is a trajectory of the system (1) corresponding to the control \(u(t)\) and satisfying the initial condition \(x(T_{1})=(0,\bar{\xi}_{1})\). We shall call the points \(t_{1},\ldots,t_{k}\) the determining points of the control being constructed. Suppose that to the point \(t_{j}\), \(j=1,\ldots,k\), there is attached a system of \(s_{j}\) intervals \(I_{j\alpha_{j}}\), whose lengths are equal to \(\varepsilon\sigma_{j\alpha_{j}}\), \(\alpha_{j}=1,\ldots,s_{j}\); we shall call these intervals the determining intervals of the control being constructed, attached to the point \(t_{j}\). Finally, let \(v_{j\alpha_{j}}\) be arbitrary fixed points of the space \(\Omega\); we shall call the point \(v_{j\alpha_{j}}\) the determining value of the control being constructed, corresponding to the determining interval \(I_{j\alpha_{j}}\). We now define the varied control \(v(t)=v(t,\varepsilon)\), \(T_{1}\leqslant t\leqslant T_{2}\), by the condition: \(v(t,\varepsilon)=u(t)\) outside the determining intervals, \(v(t,\varepsilon)=v_{j\alpha_{j}}\) on the interval \(I_{j\alpha_{j}}\).
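The construction of \(v(t,\varepsilon)\) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `needle_variation`, the data layout of `needles`, and the choice of half-open intervals \([\text{left},\text{right})\) are assumptions, since the note does not say which endpoints belong to a determining interval.

```python
def needle_variation(u, needles, eps):
    """Build the varied control v(t, eps) from a base control u.

    needles is a list of (t_j, attached), where attached lists the pairs
    (sigma, value) in attachment order: the last interval ends at t_j and
    each earlier interval ends where the next one begins.
    """
    pieces = []
    for t_j, attached in needles:
        right = t_j
        for sigma, value in reversed(attached):
            left = right - eps * sigma   # interval length is eps * sigma
            pieces.append((left, right, value))
            right = left

    def v(t):
        for left, right, value in pieces:
            if left <= t < right:
                return value             # determining value on the interval
        return u(t)                      # unchanged outside the intervals

    return v


# Two intervals attached to t_1 = 0.5 with sigma = 1 and sigma = 2.
v = needle_variation(lambda t: 0.0, [(0.5, [(1.0, 1.0), (2.0, -1.0)])], eps=0.01)
print(v(0.475), v(0.49), v(0.46), v(0.5))
```

Here the interval of length \(2\varepsilon\) occupies \([0.48, 0.5)\) and the interval of length \(\varepsilon\) occupies \([0.47, 0.48)\), so the chain ends exactly at the determining point.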

The product \(\lambda v\), where \(\lambda\) is a nonnegative number, is defined as a varied control whose determining points coincide with the determining points of the control \(v\), the lengths of the determining intervals \(J_{j\alpha_{j}}\), \(\alpha_{j}=1,\ldots,s_{j}\), attached to the point \(t_{j}\), are equal to \(\lambda\varepsilon\sigma_{j\alpha_{j}}\), while the corresponding determining values remain the same as for the control \(v\).

We define the sum \(v_{1}+v_{2}\) of two varied controls \(v_{1},v_{2}\) as a control of the same class whose determining points are obtained by taking the union of the determining points of the summand controls; to each of these points, without changing the lengths, all determining intervals of both summands attached to the point under consideration are attached; and, finally, to each determining interval of the sum control taken from the summand control \(v_{i}\), there corresponds in the sum control the same determining value as in the summand control \(v_{i}\). The controls \(\lambda v\), \(v_{1}+v_{2}\) defined by the enumerated conditions are specified up to the order in which the determining intervals are attached to the given determining point. However, this order will have no influence on the subsequent conclusions.

Denote by \(y(t,\varepsilon)\) the varied trajectory of the system (1) corresponding to the control \(v(t,\varepsilon)\) and satisfying the initial condition \(y(T_{1},\varepsilon)\equiv(0,\bar{\xi}_{1})\). For sufficiently small \(\varepsilon\) the trajectory \(y(t,\varepsilon)\) is defined on the whole interval \(T_{1}\leqslant t\leqslant T_{2}\) and can be represented on this interval in the form

\[ y(t,\varepsilon)=x(t)+\eta(t,\varepsilon)+o(\varepsilon)h(t,\varepsilon), \]

where \(h(T_{1},\varepsilon)\equiv \eta(T_{1},\varepsilon)\equiv 0\), the vector function \(h(t,\varepsilon)\), \(T_{1}\leqslant t\leqslant T_{2}\), is bounded, and the vector function \(\eta(t,\varepsilon)=(\eta^0,\ldots,\eta^n)\) is defined by the following condition: \(\eta(t,\varepsilon)\) is absolutely continuous on the interval \(T_1\leq t\leq T_2\), outside the determining intervals of the control \(v(t,\varepsilon)\) satisfies the system of equations in variations

\[ \dot{\eta}^i=\sum_{\alpha=0}^{n}\frac{\partial f^i(\bar{x}(t),u(t))}{\partial x^\alpha}\eta^\alpha,\qquad i=0,\ldots,n, \tag{4} \]

and on the determining interval \(I_{j\alpha_j}\)

\[ \eta(t,\varepsilon)=\eta(\tau_{j\alpha_j},\varepsilon)+ \bigl[f(\bar{x}(t_j),v_{j\alpha_j})-f(\bar{x}(t_j),u(t_j))\bigr](t-\tau_{j\alpha_j}), \]

where \(\tau_{j\alpha_j}\) is the left end of the interval \(I_{j\alpha_j}\).
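The first-order description of the varied trajectory can be checked numerically on a toy case. The sketch below assumes a hypothetical scalar system \(\dot{x}=-x+u\) (without the cost coordinate), base control \(u\equiv 0\), and a single determining interval of length \(\varepsilon\) ending at \(t_1=0.5\) with determining value \(v_1=1\). For this system the jump \(f(x(t_1),v_1)-f(x(t_1),u(t_1))=1\) is carried to \(T_2=1\) by equation (4), which here reads \(\dot{\eta}=-\eta\), so the predicted value of \(\eta(T_2,\varepsilon)/\varepsilon\) is \(e^{-(T_2-t_1)}\). The integrator and all names are illustrative choices.

```python
import math


def simulate(control, x0, T1, T2, steps):
    """Midpoint-rule integration of x' = -x + control(t).

    The control is sampled at step midpoints so that the endpoints of the
    determining interval never coincide with a sampling time.
    """
    h = (T2 - T1) / steps
    x = x0
    for k in range(steps):
        u_mid = control(T1 + (k + 0.5) * h)
        x_half = x + 0.5 * h * (-x + u_mid)
        x = x + h * (-x_half + u_mid)
    return x


def needle_ratio(eps, t1=0.5, v1=1.0, T2=1.0, steps=1000):
    """Return (y(T2, eps) - x(T2)) / eps for one needle of length eps at t1."""
    base = lambda t: 0.0
    varied = lambda t: v1 if t1 - eps <= t < t1 else 0.0
    return (simulate(varied, 1.0, 0.0, T2, steps)
            - simulate(base, 1.0, 0.0, T2, steps)) / eps


predicted = math.exp(-(1.0 - 0.5)) * 1.0   # jump transported by equation (4)
err_coarse = abs(needle_ratio(0.1) - predicted)
err_fine = abs(needle_ratio(0.01) - predicted)
print(predicted, err_coarse, err_fine)
```

The discrepancy shrinks as \(\varepsilon\) decreases, in line with the \(o(\varepsilon)\) remainder in the representation of \(y(t,\varepsilon)\).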

Let \(\tau\) be an arbitrary Lebesgue point of the vector function \(f(\bar{x}(t),u(t))\) from the interval \(T_1<t<T_2\), not coinciding with any of the determining points of the control \(v(t,\varepsilon)\); let \(\rho\) be an arbitrary real number. By the variation of the trajectory \(x(t)\) at the point \(\tau\), corresponding to the control \(v(t,\varepsilon)\) and the number \(\rho\), we shall mean the vector

\[ \delta x(\tau,\rho,v)=\lim_{\varepsilon\to 0}\frac{y(\tau+\rho\varepsilon,\varepsilon)-x(\tau)}{\varepsilon} =\lim_{\varepsilon\to 0}\frac{\eta(\tau,\varepsilon)}{\varepsilon} +\rho f(\bar{x}(\tau),u(\tau)). \]

The points of the form \(x(\tau)+\delta x(\tau,\rho,v)\), where \(\tau\) is fixed and \(\rho, v(t,\varepsilon)\) vary arbitrarily, form a convex cone \(K(\tau)\) with vertex at the point \(x(\tau)\).
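One way to see the convexity, sketched here under the assumption that the limits defining the variations exist for all controls involved, is that the operations \(\lambda v\) and \(v_1+v_2\) introduced above act linearly on first variations:

\[ \delta x(\tau,\lambda\rho,\lambda v)=\lambda\,\delta x(\tau,\rho,v),\quad \lambda\geq 0,\qquad \delta x(\tau,\rho_1+\rho_2,v_1+v_2)=\delta x(\tau,\rho_1,v_1)+\delta x(\tau,\rho_2,v_2). \]

The set of variations at \(\tau\) is therefore closed under nonnegative linear combinations, which is the defining property of a convex cone with vertex at \(x(\tau)\).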

Let \(\psi=(\psi_0,\ldots,\psi_n)\) be an \((n+1)\)-dimensional covariant vector of the space \(X^{n+1}\). Introduce the function

\[ H(\bar{x},\psi,u)=\sum_{\alpha=0}^{n}\psi_\alpha f^\alpha(\bar{x},u)=\psi\cdot f(\bar{x},u). \]

By \(M(\bar{x},\psi)\) we denote the exact upper bound of the function \(H(\bar{x},\psi,u)\) for fixed \(\bar{x},\psi\) and for \(u\) varying over the whole space \(\Omega\).

3°. Lemma. Let \(u(t)\), \(T_1\leq t\leq T_2\), be an optimal control taking the image point of the phase space \(X^{n+1}\) along the optimal trajectory \(x(t)\) of system (1) from the position \(x(T_1)=(0,\bar{\xi}_1)\) to the position \(x(T_2)=(x^0(T_2),\bar{\xi}_2)\). Then at every point \(\theta\) of the interval \(T_1\leq t\leq T_2\) that is a Lebesgue point of the vector function \(f(\bar{x}(t),u(t))\), the equality \(H(\bar{x}(\theta),\psi(\theta),u(\theta))=M(\bar{x}(\theta),\psi(\theta))\) holds, where the absolutely continuous vector function \(\psi(t)=(\psi_0,\ldots,\psi_n)\) satisfies on the interval \(T_1\leq t\leq T_2\) the system

\[ \dot{\psi}_i=-\sum_{\alpha=0}^{n}\frac{\partial f^\alpha(\bar{x}(t),u(t))}{\partial x^i}\psi_\alpha,\qquad i=0,\ldots,n. \tag{5} \]

Since \(f(\bar{x},u)\) does not depend on \(x^0\), \(\psi_0=\mathrm{const}\).

Proof. Since the trajectory \(x(t)\) is optimal, no point of the ray \(l(\tau)\subset X^{n+1}\), issuing from the point \(x(\tau)\) and directed along the negative \(x^0\)-axis, is an interior point for the convex cone \(K(\tau)\). Consequently, there exists a supporting hyperplane \(P(\tau)\) to the cone \(K(\tau)\), passing through the vertex \(x(\tau)\) and either separating the ray \(l(\tau)\) from \(K(\tau)\) or containing it. By \(\chi=(\chi_0,\ldots,\chi_n)\) denote the covariant vector orthogonal to \(P(\tau)\) and directed in such a way that, for an arbitrary variation \(\delta x(\tau,\rho,v)\), the inequalities

\[ \chi\cdot\delta x =\chi\cdot\lim_{\varepsilon\to 0}\frac{\eta(\tau,\varepsilon)}{\varepsilon} +\rho\chi\cdot f(\bar{x}(\tau),u(\tau))\leq 0,\qquad \chi_0\leq 0 \tag{6} \]

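are satisfied. Since \(\rho\) is an arbitrary real number of either sign, applying (6) to the unvaried control, for which \(\eta\equiv 0\), immediately gives the equality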

\[ \chi\cdot f(\bar{x}(\tau),u(\tau))=0. \tag{7} \]

Denote by \(\psi(t,\chi)\), \(T_1\leq t\leq T_2\), the solution of system (5) satisfying the condition \(\psi(\tau,\chi)=\chi\). From inequality (6) it follows that at each Lebesgue point \(\theta\) of the vector function \(f(\bar x(t), u(t))\), \(T_1 \leq \theta < \tau\), we have
\(H(\bar x(\theta), \psi(\theta,\chi), u(\theta)) = M(\bar x(\theta), \psi(\theta,\chi))\).
Indeed, if this equality is violated, then there is a point \(v_1 \in \Omega\) such that
\(\psi(\theta,\chi)\cdot [f(\bar x(\theta), v_1)-f(\bar x(\theta), u(\theta))]=a>0\).
We then define a varied control by taking the point \(\theta\) as its determining point; we attach to \(\theta\) a single determining interval of length \(\varepsilon\) and take \(v_1\) as the determining value on it. We obtain the relation
\(\psi(\theta,\chi)\cdot \eta(\theta,\varepsilon) =\varepsilon\,\psi(\theta,\chi)\cdot [f(\bar x(\theta), v_1)-f(\bar x(\theta), u(\theta))]=a\varepsilon>0\).
On the interval \(\theta \leq t \leq \tau\) the product \(\psi\cdot\eta\), by virtue of systems (4), (5), is constant. Hence
\(\psi(\tau,\chi)\cdot\eta(\tau,\varepsilon)=\chi\cdot\eta(\tau,\varepsilon)=a\varepsilon>0\), which contradicts inequality (6) taken with \(\rho=0\).
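The constancy of \(\psi\cdot\eta\) used here can be verified directly. The following short computation is a sketch that differentiates the product outside the determining intervals, using only systems (4) and (5), with all partial derivatives evaluated at \((\bar x(t),u(t))\):

\[ \frac{d}{dt}(\psi\cdot\eta) =\sum_{i=0}^{n}\dot\psi_i\,\eta^i+\sum_{i=0}^{n}\psi_i\,\dot\eta^i =-\sum_{i=0}^{n}\sum_{\alpha=0}^{n}\frac{\partial f^\alpha}{\partial x^i}\,\psi_\alpha\,\eta^i +\sum_{i=0}^{n}\sum_{\alpha=0}^{n}\psi_i\,\frac{\partial f^i}{\partial x^\alpha}\,\eta^\alpha=0. \]

The two double sums coincide after the summation indices \(i\) and \(\alpha\) are interchanged in one of them.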

The proof of the lemma is easily completed by passing to the limit \(\tau\to T_2\).

4°. Study of the function \(H(\bar x(t), \psi(t), u(t))\). Denote by \(N\) the closure of the image of the interval \(T_1 \leq t \leq T_2\) in \(\Omega\) under the mapping \(u(t)\), and denote the quantity \(\sup\limits_{u\in N} H(\bar x,\psi,u)\) by \(m(\bar x,\psi)\). Since the set \(N\) is compact by assumption, there exists a function \(u_1(t)\), \(T_1 \leq t \leq T_2\), coinciding with \(u(t)\) at every Lebesgue point of the vector function \(f(\bar x(t),u(t))\), \(T_1 \leq t \leq T_2\), and satisfying at every point \(t\) of the interval \(T_1 \leq t \leq T_2\) the condition

\[ H(t)=H(\bar x(t),\psi(t),u_1(t))=m(\bar x(t),\psi(t)). \tag{8} \]

The control \(u_1(t)\) is optimal and to it there corresponds the same optimal trajectory \(x(t)\) as to the control \(u(t)\). From equality (8) there follow the inequalities
\(H(t+\delta t)-H(\bar x(t),\psi(t),u_1(t+\delta t)) \geq H(t+\delta t)-H(t) \geq H(\bar x(t+\delta t), \psi(t+\delta t), u_1(t))-H(t)\), from which the absolute continuity of the function \(H(t)\) on the interval \(T_1 \leq t \leq T_2\) is easily derived. Finally, from systems (4), (5) it follows that almost everywhere on this interval \(dH(t)/dt=0\); consequently, taking equality (7) into account, we obtain
\(H(\bar x(t),\psi(t),u_1(t)) \equiv M(\bar x(t),\psi(t)) \equiv 0\), \(T_1 \leq t \leq T_2\).
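A heuristic sketch of why the derivative vanishes, assuming that \(t\) is a point at which \(H(t)\), \(x(t)\) and \(\psi(t)\) are differentiable and \(u(t)=u_1(t)\): dividing the right-hand inequality above by \(\delta t>0\) and by \(\delta t<0\) in turn and letting \(\delta t\to 0\) indicates that \(dH/dt\) may be computed with the control value frozen at \(u_1(t)\). Using \(\partial H/\partial\psi_\alpha=f^\alpha=\dot x^\alpha\) from the equations of motion (1) and \(\partial H/\partial x^i=-\dot\psi_i\) from system (5), one then obtains

\[ \frac{dH}{dt} =\sum_{i=0}^{n}\frac{\partial H}{\partial x^i}\,\dot x^i +\sum_{\alpha=0}^{n}\frac{\partial H}{\partial \psi_\alpha}\,\dot\psi_\alpha =-\sum_{i=0}^{n}\dot\psi_i\,\dot x^i+\sum_{\alpha=0}^{n}\dot x^\alpha\,\dot\psi_\alpha=0. \]

This computation is offered only as an illustration of the cancellation and does not replace the argument of the note.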

Combining the content of the lemma with the results of the present section, we arrive at the following general principle.

5°. L. S. Pontryagin’s maximum principle for optimal processes. If the \(2n\)-dimensional vector
\((\bar x(t),\psi(t))=(x^1,\ldots,x^n;\psi_1,\ldots,\psi_n)\)
satisfies the Hamiltonian system

\[ \dot x^i=f^i(\bar x,u_1(t))=\frac{\partial H}{\partial \psi_i},\qquad \dot\psi_i=-\sum_{\alpha=0}^{n}\frac{\partial f^\alpha(\bar x,u_1(t))}{\partial x^i}\,\psi_\alpha =-\frac{\partial H}{\partial x^i},\qquad i=1,\ldots,n, \]

where the bounded measurable function \(u_1(t)\) at each instant of time \(t\) satisfies the condition
\(H(\bar x(t),\psi(t),u_1(t))=M(\bar x(t),\psi(t))=0\)
and \(\psi_0=\mathrm{const}\leq 0\), then we shall call \(u_1(t)\) an extremal control, and \(\bar x(t)\) the corresponding extremal trajectory of system (2). Every optimal trajectory corresponding to an optimal control \(u(t)\) is also an extremal trajectory corresponding to some extremal control \(u_1(t)\); almost everywhere \(u(t)=u_1(t)\).
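The conditions of the principle can be checked by hand on a standard example, sketched below in Python. Everything specific here is an assumption made for illustration and does not come from the note: the control space \(\Omega=[-1,1]\), the double integrator \(\dot x^1=x^2\), \(\dot x^2=u\) with \(f^0\equiv 1\), the endpoints \(\bar\xi_1=(-1,0)\), \(\bar\xi_2=(0,0)\), and the closed-form candidate \(u_1(t)=+1\) for \(t<1\), \(u_1(t)=-1\) for \(1\leq t\leq 2\), with adjoint vector \(\psi(t)=(-1,\,1,\,1-t)\), which satisfies \(\dot\psi_1=0\), \(\dot\psi_2=-\psi_1\).

```python
def u1(t):
    """Candidate extremal control: accelerate, then brake (switch at t = 1)."""
    return 1.0 if t < 1.0 else -1.0


def xbar(t):
    """Closed-form trajectory of x1' = x2, x2' = u1(t) from (-1, 0)."""
    if t < 1.0:
        return (-1.0 + 0.5 * t * t, t)
    s = t - 1.0
    return (-0.5 + s - 0.5 * s * s, 1.0 - s)


def psi(t):
    """Adjoint vector (psi_0, psi_1, psi_2) solving system (5) for this example."""
    return (-1.0, 1.0, 1.0 - t)


def f(xb, u):
    """Right-hand sides (f^0, f^1, f^2) with f^0 = 1 (hypothetical example)."""
    return (1.0, xb[1], u)


def H(xb, p, u):
    return sum(pi * fi for pi, fi in zip(p, f(xb, u)))


omega_grid = [-1.0 + 2.0 * k / 200 for k in range(201)]   # samples of [-1, 1]
times = [2.0 * k / 400 for k in range(401)]               # samples of [0, 2]

# Largest |H| along the candidate, and largest gap between sup_u H and H(u1).
max_abs_H = max(abs(H(xbar(t), psi(t), u1(t))) for t in times)
max_gap = max(
    max(H(xbar(t), psi(t), u) for u in omega_grid) - H(xbar(t), psi(t), u1(t))
    for t in times
)
print(xbar(2.0), max_abs_H, max_gap)
```

In this example \(H\) is linear in \(u\), so its maximum over \([-1,1]\) is attained at \(u=\operatorname{sign}\psi_2\), and the choice \(\psi_0=-1\leq 0\) is what makes \(H\) vanish identically along the trajectory.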

Steklov Mathematical Institute
Academy of Sciences of the USSR

Received 12 VI 1958

CITED LITERATURE

  1. V. G. Boltyanskii, R. V. Gamkrelidze, L. S. Pontryagin, DAN, 110, No. 1, 7 (1956).
  2. McShane, Ann. Math., 41, 314 (1940).
  3. V. G. Boltyanskii, DAN, 119, No. 6 (1958).
