DIFFERENTIAL EQUATIONS, 1967, VOL. III, NO. 3
Synthesis of Optimal Controls in Nonlinear Second-Order Oscillatory Systems
Consider a controlled object whose motion is described by a second-order differential equation
$$\ddot{x} = f(x, \dot{x}, u) \quad (1)$$
with a one-dimensional control domain:
$$-1 \le u \le 1. \quad (2)$$
The introduction of the variables $x_1 = x$, $x_2 = \dot{x}$ reduces equation (1) to a normal second-order system:
$$\begin{aligned} \dot{x}_1 &= x_2, \\ \dot{x}_2 &= f(x_1, x_2, u). \end{aligned} \quad (3)$$
We assume that the function $f$ is continuous and continuously differentiable with respect to $x_1, x_2, u$, and twice continuously differentiable with respect to $x_1, x_2$. Furthermore, it satisfies the following conditions:
$$f(0, 0, 1) > 0, \quad f(0, 0, -1) < 0; \quad (4)$$
$$\frac{\partial f(x_1, x_2, u)}{\partial u} > 0 \text{ for all } x_1, x_2, u; \quad (5)$$
$$\left( \frac{\partial f(x_1, x_2, u)}{\partial x_2} \right)^2 + 4 \frac{\partial f(x_1, x_2, u)}{\partial x_1} < 0 \text{ for all } x_1, x_2, u; \quad (6)$$
$$\left( \frac{\partial^2 f}{\partial x_1 \partial x_2} \right)^2 - \frac{\partial^2 f}{\partial x_1^2} \frac{\partial^2 f}{\partial x_2^2} \le 0, \quad \frac{\partial^2 f}{\partial x_1^2} + \frac{\partial^2 f}{\partial x_2^2} \le 0 \quad (7)$$
for $u = \pm 1$ and all $x_1, x_2$. Condition (7) implies that for $u = \pm 1$ the quadratic form
$$\frac{\partial^2 f}{\partial x_1^2} h_1^2 + 2 \frac{\partial^2 f}{\partial x_1 \partial x_2} h_1 h_2 + \frac{\partial^2 f}{\partial x_2^2} h_2^2$$
(the second differential of $f$ with respect to $x_1, x_2$) takes only non-positive values. These conditions are satisfied, for example, by the linear object $\ddot{x} + x = u$ (see \cite{1}, pp. 34–43), as well as by nonlinear objects differing from it by a "small" convex addition. Specifically, let $\phi(x_1, x_2, u)$ be an arbitrary function convex in $x_1, x_2$ (i.e., satisfying condition (7)) with bounded first derivatives $\frac{\partial \phi}{\partial x_1}, \frac{\partial \phi}{\partial x_2}$. Then, setting
$$f(x_1, x_2, u) = -x_1 + u + \mu \phi(x_1, x_2, u),$$
we obtain for sufficiently small $\mu$ an object (3) satisfying all conditions (4)–(7). Additionally, if $f$ takes the form $f(x_1, x_2, u) = -x_1 + g(x_2, u)$, where $g$ satisfies $g(0, 1) > 0, g(0, -1) < 0, \frac{\partial g}{\partial u} > 0, \frac{\partial^2 g}{\partial x_2^2} \le 0$, then all conditions (4)–(7) are also fulfilled (see \cite{2}, p. 510).
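As a quick check of the claim about the linear object, one can substitute $f = -x_1 + u$ (that is, $\ddot{x} + x = u$) into the conditions directly. The computation below is only an illustrative verification and is not part of the original argument:
$$f(0, 0, \pm 1) = \pm 1, \quad \frac{\partial f}{\partial u} = 1 > 0, \quad \frac{\partial f}{\partial x_1} = -1 < 0, \quad \frac{\partial f}{\partial x_2} = 0, \quad \frac{\partial^2 f}{\partial x_i \partial x_j} = 0 \ (i, j = 1, 2).$$
All second derivatives vanish, so the second differential of $f$ is identically zero and hence non-positive. The quantity $(\partial f / \partial x_2)^2 + 4\, \partial f / \partial x_1$, which appears later as a discriminant, equals $-4$ for this object.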
For the object described by relations (2)–(7), we consider the problem of time-optimal arrival at the origin. As we shall see, the synthesis of optimal controls for this nonlinear object is qualitatively the same as for the linear object $\ddot{x} + x = u$; that is, the optimal trajectories approach the origin as spirals (see Fig. 13, p. 41 of \cite{1} and Fig. 10 below). Interestingly, this "oscillatory" character of the optimal trajectories is linked to condition (6), which can be termed the condition of "strong negativity" of the derivative $\frac{\partial f}{\partial x_1}$. As shown in \cite{3}, replacing this condition with the condition of non-negativity of the derivative leads to each optimal control having only one switching, and the synthesis pattern resembles that of the linear object $\ddot{x} = u$ (see \cite{1}, pp. 29–34, particularly Fig. 7).
We now proceed to solve the time-optimal problem for object (2)–(7). First, we find all phase trajectories satisfying the maximum principle (\cite{1}, p. 26). The Hamiltonian $H$ for this object is:
$$H = \psi_1 x_2 + \psi_2 f(x_1, x_2, u).$$
The system of equations for the adjoint variables is:
$$\begin{aligned} \dot{\psi}_1 &= -\psi_2 \frac{\partial f}{\partial x_1}, \\ \dot{\psi}_2 &= -\psi_1 - \psi_2 \frac{\partial f}{\partial x_2}. \end{aligned} \quad (9)$$
The maximum condition (along an optimal trajectory, $H$ reaches its maximum with respect to $u$) implies that the expression $\psi_2 f(x_1, x_2, u)$ reaches its maximum. Since $\frac{\partial f}{\partial u} > 0$, the function $f$ is monotonically increasing in $u$, thus:
$$u = \text{sign } \psi_2. \quad (10)$$
To understand the behavior of $\psi_2$, we examine the law of change for the angle $\alpha = \arctan(\psi_1 / \psi_2)$. According to (9), we have:
$$\frac{d}{dt} \arctan \frac{\psi_1}{\psi_2} = \frac{\dot{\psi}_1 \psi_2 - \psi_1 \dot{\psi}_2}{\psi_1^2 + \psi_2^2} = \frac{\psi_1^2 + \psi_1 \psi_2 \frac{\partial f}{\partial x_2} - \psi_2^2 \frac{\partial f}{\partial x_1}}{\psi_1^2 + \psi_2^2}.$$
The discriminant of the quadratic form in the numerator is $(\frac{\partial f}{\partial x_2})^2 + 4 \frac{\partial f}{\partial x_1}$. According to (6), this discriminant is negative, which is precisely why condition (6) was formulated. Thus, the numerator maintains a constant sign. It is clear that it remains positive (setting $\psi_2 = 0$ yields $\psi_1^2 > 0$), meaning $\arctan(\psi_1 / \psi_2)$ monotonically increases, or equivalently, the vector $\{\psi_1(t), \psi_2(t)\}$ rotates clockwise.
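For the linear object $\ddot{x} + x = u$ this rotation can be written out explicitly, which gives a useful sanity check (an illustration only; $\rho$ and $\alpha_0$ are constants of integration introduced here). With $\partial f / \partial x_1 = -1$ and $\partial f / \partial x_2 = 0$, the adjoint system and the angular rate become
$$\dot{\psi}_1 = \psi_2, \quad \dot{\psi}_2 = -\psi_1, \quad \frac{d}{dt} \arctan \frac{\psi_1}{\psi_2} = \frac{\psi_1^2 + \psi_2^2}{\psi_1^2 + \psi_2^2} = 1,$$
so that
$$\psi_1(t) = \rho \sin(t + \alpha_0), \quad \psi_2(t) = \rho \cos(t + \alpha_0), \quad \rho > 0.$$
The vector $\psi$ turns at unit angular rate, and consecutive zeros of $\psi_2$ are separated by exactly $\pi$. In the nonlinear case only the sign of the rate is guaranteed, not its value.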
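Furthermore, for $u = \text{const}$, the phase velocity vector $\{\dot{x}_1, \dot{x}_2\}$ satisfies $\ddot{x}_1 = \dot{x}_2$ and $\ddot{x}_2 = \frac{\partial f}{\partial x_1} \dot{x}_1 + \frac{\partial f}{\partial x_2} \dot{x}_2$, so that:
$$\frac{d}{dt} \arctan \frac{\dot{x}_2}{\dot{x}_1} = \frac{\ddot{x}_2 \dot{x}_1 - \dot{x}_2 \ddot{x}_1}{\dot{x}_1^2 + \dot{x}_2^2} = \frac{-\dot{x}_2^2 + \dot{x}_1 \dot{x}_2 \frac{\partial f}{\partial x_2} + \dot{x}_1^2 \frac{\partial f}{\partial x_1}}{\dot{x}_1^2 + \dot{x}_2^2}.$$
The numerator is a quadratic form in $\dot{x}_1, \dot{x}_2$ with the same discriminant $(\frac{\partial f}{\partial x_2})^2 + 4 \frac{\partial f}{\partial x_1}$ as above, which is negative. Since the form equals $-\dot{x}_2^2 < 0$ at $\dot{x}_1 = 0$, it is negative wherever the phase velocity is non-zero. In other words, on each segment of the solution where $u = \text{const}$, the phase velocity vector $\{\dot{x}_1(t), \dot{x}_2(t)\}$ rotates clockwise.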
Let $x_0 = (x_{1,0}, x_{2,0})$ be an arbitrary point in the phase plane where $x_0 \neq 0$. Consider the solution $x(t), \psi(t)$ of the system (3), (9) for $u = 1$, satisfying the initial conditions:
$$x_1(0) = x_{1,0}, \quad x_2(0) = x_{2,0}, \quad \psi_1(0) = 1, \quad \psi_2(0) = 0. \quad (13)$$
From (9) and (13), we find $\dot{\psi}_2(0) = -1$. Consequently, $\psi_2(t) > 0$ for small negative $t$. Let $\tau_+(x_0)$ be the negative root of $\psi_2(t)$ closest to zero. On the interval $\tau_+ < t < 0$, the function $\psi_2$ remains positive. We denote the segment of the trajectory $x(t)$ corresponding to this time interval as $K_+(x_0)$. This segment ends at $x_0$; its starting point is denoted by $\xi_+(x_0)$. Along this segment, $\psi_2(t) > 0$ and $u = 1$, satisfying the maximum principle. Since $f(x_1, x_2, 1) \neq 0$, the point $x_0$ is not an equilibrium, and the phase velocity vector is non-zero along the entire arc.
Similarly, using initial conditions $\psi_1(0) = -1, \psi_2(0) = 0$, we find a solution for $u = -1$. Let $\tau_-(x_0)$ be the negative root of $\psi_2(t)$ closest to zero. The corresponding trajectory segment is $K_-(x_0)$, with the starting point denoted by $\xi_-(x_0)$. This segment also satisfies the maximum principle.
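The construction of $K_+(x_0)$ and $K_-(x_0)$ can be tried out numerically. The following is a minimal sketch, assuming the caller supplies $f$ and its partial derivatives as Python callables; the names `backward_arc`, `rk4_step`, `f_x1`, `f_x2` and the step size are illustrative choices, not part of the paper. It integrates system (3) together with the adjoint system backward in time from $\psi_1(0) = \pm 1$, $\psi_2(0) = 0$ and stops at the first zero of $\psi_2$.

```python
import math

def rk4_step(rhs, y, h):
    """One classical Runge-Kutta step of size h (h may be negative)."""
    k1 = rhs(y)
    k2 = rhs([a + 0.5 * h * b for a, b in zip(y, k1)])
    k3 = rhs([a + 0.5 * h * b for a, b in zip(y, k2)])
    k4 = rhs([a + h * b for a, b in zip(y, k3)])
    return [a + h / 6.0 * (p + 2 * q + 2 * r + s)
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def backward_arc(x0, u, f, f_x1, f_x2, h=1e-3, t_max=50.0):
    """Return (tau, xi): the negative zero of psi_2 closest to 0 and the
    starting point of the arc K_+(x0) (u = +1) or K_-(x0) (u = -1)."""
    def rhs(y):
        x1, x2, p1, p2 = y
        return [x2,
                f(x1, x2, u),
                -p2 * f_x1(x1, x2, u),          # adjoint equation for psi_1
                -p1 - p2 * f_x2(x1, x2, u)]     # adjoint equation for psi_2

    # Initial conditions: psi_1(0) = u = +/-1, psi_2(0) = 0.
    y = [x0[0], x0[1], float(u), 0.0]
    t = 0.0
    first = True
    while t > -t_max:
        y_new = rk4_step(rhs, y, -h)            # step backward in time
        # On the arc psi_2 has the sign of u; stop when that sign is lost.
        if not first and y_new[3] * u <= 0.0:
            s = y[3] / (y[3] - y_new[3])        # linear interpolation of the zero
            xi = (y[0] + s * (y_new[0] - y[0]),
                  y[1] + s * (y_new[1] - y[1]))
            return t - s * h, xi
        y, t, first = y_new, t - h, False
    raise RuntimeError("psi_2 did not vanish within t_max")

# Illustration on the linear object x'' + x = u, i.e. f = -x1 + u.
f = lambda x1, x2, u: -x1 + u
f_x1 = lambda x1, x2, u: -1.0
f_x2 = lambda x1, x2, u: 0.0

tau_plus, xi_plus = backward_arc((0.0, 0.0), +1, f, f_x1, f_x2)
tau_minus, xi_minus = backward_arc((0.0, 0.0), -1, f, f_x1, f_x2)
```

For the linear object the trajectories with $u = \pm 1$ are circles about $(\pm 1, 0)$, so one expects $\tau_\pm \approx -\pi$ and starting points close to $(2, 0)$ and $(-2, 0)$. For a nonlinear $f$ the same routine applies, but the values of $\tau_\pm$ depend on $x_0$.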
We now prove that if $x_0$ lies on the $x_1$-axis, then both points $\xi_+(x_0)$ and $\xi_-(x_0)$ also lie on the $x_1$-axis. In this case, $K_+(x_0)$ is a convex arc located entirely on one side of the $x_1$-axis, with tangents at its ends parallel to the $x_2$-axis (Fig. 1a). If $x_0$ does not lie on the $x_1$-axis, then $\xi_+(x_0)$ and $\xi_-(x_0)$ lie on opposite sides of the axis, and the arc intersects the $x_1$-axis exactly once (Fig. 1b). This follows from the fact that the vector $\psi(t)$ rotates clockwise, and at two consecutive zeros of $\psi_2(t)$, the values of $\psi_1(t)$ must have opposite signs.
The convexity of these arcs follows from the unidirectional rotation of the tangent vector. The fact that the arc $K_+(x_0)$ can intersect the $x_1$-axis at most once is proven by contradiction: if it intersected twice, the phase velocity vector would be parallel to the $x_2$-axis at two points (since $\dot{x}_1 = x_2 = 0$ on the $x_1$-axis), which would contradict the property that $\psi_2(t)$ has no zeros between $\tau_+$ and $0$.
Next, we construct the trajectories arriving at the origin that satisfy the maximum principle. Starting from the origin $o$, we construct the arc $K_+(o)$. This arc lies entirely below the $x_1$-axis, and its starting point $\zeta_1$ lies on the $x_1$-axis (Fig. 4). For any point $z$ on $K_+(o)$, we can define an arc $K_-(z)$ ending at $z$; in particular, the arc $K_-(\zeta_1)$ ends at $\zeta_1$, and we denote its starting point by $\zeta_2$. Continuing this process, we construct a sequence of arcs $K_+(o), K_-(\zeta_1), K_+(\zeta_2), \dots$, forming a trajectory $\eta_+(\zeta_0)$. Similarly, we construct $\eta_-(\zeta_1)$ starting with $K_-(o)$. All such trajectories satisfy the maximum principle and consist of alternating segments where $u = +1$ and $u = -1$.
The topological structure of these trajectories in the phase plane forms a series of curvilinear quadrilaterals $Q_i^-, Q_i^+$. Each trajectory, except those forming the boundaries of these quadrilaterals, enters a quadrilateral once and exits it at a non-zero angle. The switching line consists of the union of the arcs $K_+(o), K_-(o)$ and their subsequent mappings.
Conclusion
The final result for the time-optimal control of the object (2)–(7) is as follows: the optimal trajectories arriving at the origin are spirals consisting of alternating segments corresponding to $u = +1$ and $u = -1$. The switching line consists of the constructed arcs, and the synthesis of optimal controls is realized by a function $u(x_1, x_2)$ that takes the value $-1$ above the switching line and $+1$ below it. The qualitative behavior of the optimal trajectories is shown in Fig. 10.
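To make the synthesis rule concrete, the sketch below applies it to the linear object $\ddot{x} + x = u$ only, for which the switching line is known in closed form from \cite{1}: a chain of unit semicircles, below the $x_1$-axis for $x_1 > 0$ and above it for $x_1 < 0$. The function names `switching_curve`, `u_opt`, `simulate` and the step size are assumptions made for this illustration; for a nonlinear $f$ the switching line would have to be built from the arcs $K_\pm$ instead.

```python
import math

def switching_curve(x1):
    """Ordinate of the switching line of x'' + x = u at abscissa x1
    (unit semicircles centred at (+/-1, 0), (+/-3, 0), ...)."""
    a = abs(x1)
    c = 2 * math.floor(a / 2.0) + 1            # centre of the semicircle covering a
    height = math.sqrt(max(0.0, 1.0 - (a - c) ** 2))
    return -height if x1 > 0 else height

def u_opt(x1, x2):
    """Synthesis: u = -1 above the switching line, u = +1 on or below it."""
    return -1.0 if x2 > switching_curve(x1) else 1.0

def simulate(x1, x2, h=1e-3, t_end=8.0):
    """Apply the feedback with u frozen over each step of length h and
    return the smallest distance to the origin that was reached."""
    best = math.hypot(x1, x2)
    cos_h, sin_h = math.cos(h), math.sin(h)
    for _ in range(int(t_end / h)):
        u = u_opt(x1, x2)
        # For constant u the motion is a clockwise rotation about (u, 0),
        # so each step can be taken exactly.
        y1 = x1 - u
        y1, x2 = y1 * cos_h + x2 * sin_h, -y1 * sin_h + x2 * cos_h
        x1 = y1 + u
        best = min(best, math.hypot(x1, x2))
    return best

closest = simulate(3.0, 0.0)
```

Starting from $(3, 0)$, the simulated trajectory should switch twice and then follow the last semicircle into a small neighbourhood of the origin whose size is governed by the step `h`. Because the control is frozen over each step, the numerical trajectory only approximates the optimal one near the switching line.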
References
1. Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V., Mishchenko, E. F. The Mathematical Theory of Optimal Processes. Fizmatgiz, 1961.
2. Boltyanskii, V. G. Izv. AN SSSR, Ser. Mat., 28, no. 3, 481–514, 1964.
3. Boltyanskii, V. G., Roitenberg, E. Ya. Kibernetika, 52–56, 1966.
4. Boltyanskii, V. G. Mathematical Methods of Optimal Control. "Nauka", 1966.
Received by the Editorial Board September 1966. V.A. Steklov Mathematical Institute.