The synthesis of optimal controls in nonlinear oscillatory systems of second order
V. G. Boltyanskii, G. Nasritdinov
Submitted 1967-01-01 | SovietRxiv: ru-196701.84818 | Translated from Russian

DIFFERENTIAL EQUATIONS, 1967, VOL. III, NO. 3

Synthesis of Optimal Controls in Nonlinear Second-Order Oscillatory Systems

Consider a controlled object whose motion is described by a second-order differential equation
$$\ddot{x} = f(x, \dot{x}, u) \quad (1)$$
with a one-dimensional control domain:
$$-1 \le u \le 1. \quad (2)$$
Introducing the variables $x_1 = x$, $x_2 = \dot{x}$ reduces equation (1) to a normal second-order system:
$$\begin{aligned} \dot{x}_1 &= x_2, \\ \dot{x}_2 &= f(x_1, x_2, u). \end{aligned} \quad (3)$$
We assume that the function $f$ is continuous and continuously differentiable with respect to $x_1, x_2, u$, and twice continuously differentiable with respect to $x_1, x_2$. Furthermore, it satisfies the following conditions:
$$f(0, 0, 1) > 0, \quad f(0, 0, -1) < 0, \quad (4)$$
$$\frac{\partial f(x_1, x_2, u)}{\partial u} > 0 \text{ for all } x_1, x_2, u; \quad (5)$$
$$\left( \frac{\partial f}{\partial x_2} \right)^2 + 4 \frac{\partial f}{\partial x_1} < 0 \text{ for all } x_1, x_2, u; \quad (6)$$
$$\left( \frac{\partial^2 f}{\partial x_1 \partial x_2} \right)^2 - \frac{\partial^2 f}{\partial x_1^2} \frac{\partial^2 f}{\partial x_2^2} \le 0, \quad \frac{\partial^2 f}{\partial x_1^2} + \frac{\partial^2 f}{\partial x_2^2} \le 0 \quad (7)$$
for $u = \pm 1$ and all $x_1, x_2$. Condition (7) means that, for $u = \pm 1$, the quadratic form $\frac{\partial^2 f}{\partial x_1^2} \xi_1^2 + 2 \frac{\partial^2 f}{\partial x_1 \partial x_2} \xi_1 \xi_2 + \frac{\partial^2 f}{\partial x_2^2} \xi_2^2$ takes only non-positive values. These conditions are satisfied, for example, by the linear object $\ddot{x} + x = u$ (see \cite{1}, pp. 34–43), as well as by nonlinear objects that differ from it by a "small" convex addition. Specifically, let $\phi(x_1, x_2, u)$ be an arbitrary function convex in $x_1, x_2$ (i.e., satisfying condition (7)) with bounded first derivatives $\frac{\partial \phi}{\partial x_1}, \frac{\partial \phi}{\partial x_2}$. Then, setting
$$f(x_1, x_2, u) = -x_1 + u + \mu \phi(x_1, x_2, u),$$
we obtain for sufficiently small $\mu$ an object (3) satisfying all conditions (4)–(7). Additionally, if $f$ has the form $f(x_1, x_2, u) = -x_1 + g(x_2, u)$, where $g$ satisfies $g(0, 1) > 0$, $g(0, -1) < 0$, $\frac{\partial g}{\partial u} > 0$, $\left( \frac{\partial g}{\partial x_2} \right)^2 < 4$, $\frac{\partial^2 g}{\partial x_2^2} \le 0$, then all conditions (4)–(7) are also fulfilled (see \cite{2}, p. 510).
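The conditions on $f$ can be checked mechanically for a concrete right-hand side. The following is a minimal sketch, not part of the original paper: the sample function $f = -x_1 + u - \mu \ln\cosh x_2$ and the value `MU = 0.5` are hypothetical choices of the form $-x_1 + g(x_2, u)$, and the second-derivative condition is read non-strictly so that the linear object also passes. The sketch tests the sign conditions at the origin, monotonicity in $u$, negativity of the discriminant $(\partial f/\partial x_2)^2 + 4\,\partial f/\partial x_1$ used later in the argument, and the conditions on the second derivatives, all on a finite grid only.

```python
import math

MU = 0.5  # hypothetical "small" parameter, chosen for illustration only


def f(x1, x2, u):
    """Sample right-hand side (an assumption): -x1 + u - MU*ln(cosh(x2))."""
    return -x1 + u - MU * math.log(math.cosh(x2))


def partials(x1, x2, u):
    """Analytic partial derivatives of the sample f."""
    t = math.tanh(x2)
    return {
        "f_u": 1.0,
        "f_x1": -1.0,
        "f_x2": -MU * t,
        "f_x1x1": 0.0,
        "f_x1x2": 0.0,
        "f_x2x2": -MU * (1.0 - t * t),
    }


def conditions_hold(x1, x2, u):
    """Pointwise check of monotonicity in u, the discriminant, and the Hessian."""
    p = partials(x1, x2, u)
    monotone_in_u = p["f_u"] > 0
    discriminant_negative = p["f_x2"] ** 2 + 4.0 * p["f_x1"] < 0
    hessian_ok = (
        p["f_x1x2"] ** 2 - p["f_x1x1"] * p["f_x2x2"] <= 0
        and p["f_x1x1"] + p["f_x2x2"] <= 0
    )
    return monotone_in_u and discriminant_negative and hessian_ok


# A finite grid is only a spot check, not a proof for all x1, x2.
grid = [0.5 * i for i in range(-10, 11)]
all_ok = (
    f(0.0, 0.0, 1.0) > 0
    and f(0.0, 0.0, -1.0) < 0
    and all(conditions_hold(a, b, u) for a in grid for b in grid for u in (-1.0, 1.0))
)
```

Supplying the derivatives analytically avoids finite-difference noise in the second derivatives; for a different $f$ only `f` and `partials` need to change.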

For the object described by relations (2)–(7), we consider the problem of time-optimal arrival at the origin. As we shall see, the synthesis of optimal controls for this nonlinear object is qualitatively the same as for the linear object $\ddot{x} + x = u$; that is, the optimal trajectories approach the origin as spirals (see Fig. 13, p. 41 of \cite{1} and Fig. 10 below). Interestingly, this "oscillatory" character of the optimal trajectories is linked to condition (6), which can be termed the condition of "strong negativity" of the derivative $\frac{\partial f}{\partial x_1}$. As shown in \cite{3}, replacing this condition with the condition of non-negativity of the derivative leads to each optimal control having only one switching, and the synthesis pattern resembles that of the linear object $\ddot{x} = u$ (see \cite{1}, pp. 29–34, particularly Fig. 7).

We now proceed to solve the time-optimal problem for object (2)–(7). First, we find all phase trajectories satisfying the maximum principle (\cite{1}, p. 26). The Hamiltonian $H$ for this object is:
$$H = \psi_1 x_2 + \psi_2 f(x_1, x_2, u). \quad (8)$$
The system of equations for the adjoint variables is:
$$\begin{aligned} \dot{\psi}_1 &= -\psi_2 \frac{\partial f}{\partial x_1}, \\ \dot{\psi}_2 &= -\psi_1 - \psi_2 \frac{\partial f}{\partial x_2}. \end{aligned} \quad (9)$$
The maximum condition (along an optimal trajectory, $H$ reaches its maximum with respect to $u$) implies that the expression $\psi_2 f(x_1, x_2, u)$ reaches its maximum. Since $\frac{\partial f}{\partial u} > 0$, the function $f$ is monotonically increasing in $u$, thus:
$$u = \text{sign } \psi_2. \quad (10)$$
To understand the behavior of $\psi_2$, we examine the law of change for the angle $\alpha = \arctan(\psi_1 / \psi_2)$. According to (9), we have:
$$\frac{d}{dt} \arctan \frac{\psi_1}{\psi_2} = \frac{\dot{\psi}_1 \psi_2 - \psi_1 \dot{\psi}_2}{\psi_1^2 + \psi_2^2} = \frac{\psi_1^2 + \psi_1 \psi_2 \frac{\partial f}{\partial x_2} - \psi_2^2 \frac{\partial f}{\partial x_1}}{\psi_1^2 + \psi_2^2}. \quad (11)$$
The discriminant of the quadratic form in the numerator is $(\frac{\partial f}{\partial x_2})^2 + 4 \frac{\partial f}{\partial x_1}$. According to (6), this discriminant is negative, which is precisely why condition (6) was formulated. Thus, the numerator maintains a constant sign. It is clear that it remains positive (setting $\psi_2 = 0$ yields $\psi_1^2 > 0$), meaning $\arctan(\psi_1 / \psi_2)$ monotonically increases, or equivalently, the vector $\{\psi_1(t), \psi_2(t)\}$ rotates clockwise.
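For the linear object $\ddot{x} + x = u$ the rotation of the adjoint vector can be observed numerically. The sketch below is an illustration under that assumption only ($\partial f/\partial x_1 = -1$, $\partial f/\partial x_2 = 0$); the function names, step size and time horizon are hypothetical. It integrates the adjoint system with a classical Runge-Kutta step, records the zeros of $\psi_2$ (the switching instants of $u = \text{sign } \psi_2$), and tracks the smallest value of the numerator $\psi_1^2 + \psi_1 \psi_2 \frac{\partial f}{\partial x_2} - \psi_2^2 \frac{\partial f}{\partial x_1}$ along the way.

```python
import math

# Assumption: linear object x'' + x = u, so both partial derivatives are constant.
F_X1, F_X2 = -1.0, 0.0


def adjoint_rhs(p1, p2):
    """Right-hand side of the adjoint system for (psi_1, psi_2)."""
    return (-p2 * F_X1, -p1 - p2 * F_X2)


def rk4(p1, p2, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = adjoint_rhs(p1, p2)
    k2 = adjoint_rhs(p1 + 0.5 * h * k1[0], p2 + 0.5 * h * k1[1])
    k3 = adjoint_rhs(p1 + 0.5 * h * k2[0], p2 + 0.5 * h * k2[1])
    k4 = adjoint_rhs(p1 + h * k3[0], p2 + h * k3[1])
    return (
        p1 + h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
        p2 + h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
    )


def switching_times(p1, p2, t_end=10.0, h=1e-3):
    """Return the zeros of psi_2 on [0, t_end] and the minimum of the numerator."""
    times, t = [], 0.0
    min_numerator = float("inf")
    while t < t_end:
        numerator = p1 * p1 + p1 * p2 * F_X2 - p2 * p2 * F_X1
        min_numerator = min(min_numerator, numerator)
        q1, q2 = rk4(p1, p2, h)
        if p2 * q2 < 0.0:
            # psi_2 changed sign inside the step: locate the zero by linear interpolation
            times.append(t + h * p2 / (p2 - q2))
        p1, p2, t = q1, q2, t + h
    return times, min_numerator


times, min_numerator = switching_times(0.0, 1.0)
gaps = [b - a for a, b in zip(times, times[1:])]
```

In this linear case the gaps between consecutive switchings should all be close to $\pi$, and the numerator stays positive, in line with the monotone rotation of $\{\psi_1(t), \psi_2(t)\}$.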

Furthermore, for $u = \text{const}$, we have:
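$$\frac{d}{dt} \arctan \frac{\dot{x}_2}{\dot{x}_1} = \frac{\ddot{x}_2 \dot{x}_1 - \dot{x}_2 \ddot{x}_1}{\dot{x}_1^2 + \dot{x}_2^2} = -\frac{f^2 - x_2 f \frac{\partial f}{\partial x_2} - x_2^2 \frac{\partial f}{\partial x_1}}{x_2^2 + f^2}. \quad (12)$$
Here $\dot{x}_1 = x_2$, $\dot{x}_2 = f$, $\ddot{x}_1 = f$, and $\ddot{x}_2 = \frac{\partial f}{\partial x_1} x_2 + \frac{\partial f}{\partial x_2} f$ for $u = \text{const}$. The numerator is a quadratic form in $x_2$ and $f$ with the same discriminant $(\frac{\partial f}{\partial x_2})^2 + 4 \frac{\partial f}{\partial x_1}$ as above, so it also keeps a constant sign; setting $x_2 = 0$ gives $-f^2$, so this sign is negative. In other words, on each segment of the solution where $u = \text{const}$, the phase velocity vector $\{\dot{x}_1(t), \dot{x}_2(t)\}$ rotates clockwise.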

Let $x_0 = (x_{1,0}, x_{2,0})$ be an arbitrary point of the phase plane, $x_0 \neq 0$. Consider the solution $x(t), \psi(t)$ of the system (3), (9) for $u = 1$, satisfying the initial conditions:
$$x_1(0) = x_{1,0}, \quad x_2(0) = x_{2,0}, \quad \psi_1(0) = 1, \quad \psi_2(0) = 0. \quad (13)$$
From (9) and (13), we find $\dot{\psi}_2(0) = -1$. Consequently, $\psi_2(t) > 0$ for small negative $t$. Let $\tau_+(x_0)$ be the negative root of $\psi_2(t)$ closest to zero. On the interval $\tau_+ < t < 0$, the function $\psi_2$ remains positive. We denote the segment of the trajectory $x(t)$ corresponding to this time interval as $K_+(x_0)$. This segment ends at $x_0$; its starting point is denoted by $\xi_+(x_0)$. Along this segment, $\psi_2(t) > 0$ and $u = 1$, satisfying the maximum principle. Since $f(x_1, x_2, 1) \neq 0$, the point $x_0$ is not an equilibrium, and the phase velocity vector is non-zero along the entire arc.

Similarly, using initial conditions $\psi_1(0) = -1, \psi_2(0) = 0$, we find a solution for $u = -1$. Let $\tau_-(x_0)$ be the negative root of $\psi_2(t)$ closest to zero. The corresponding trajectory segment is $K_-(x_0)$, with the starting point denoted by $\xi_-(x_0)$. This segment also satisfies the maximum principle.
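The construction of the arcs $K_\pm(x_0)$ amounts to integrating the state and adjoint equations backward in time until $\psi_2$ returns to zero. The sketch below illustrates this for the linear object $\ddot{x} + x = u$, which is an assumption made only to keep the example checkable; `backward_arc`, the step size and the interpolation at the stopping point are hypothetical implementation choices, not the authors' procedure.

```python
import math


def f(x1, x2, u):
    """Assumed right-hand side: linear oscillator x'' + x = u."""
    return -x1 + u


def f_x1(x1, x2, u):
    return -1.0


def f_x2(x1, x2, u):
    return 0.0


def rhs(s, u):
    """Combined state and adjoint system for s = (x1, x2, psi1, psi2)."""
    x1, x2, p1, p2 = s
    return (x2, f(x1, x2, u), -p2 * f_x1(x1, x2, u), -p1 - p2 * f_x2(x1, x2, u))


def rk4_step(s, u, h):
    """One classical Runge-Kutta step; h < 0 integrates backward in time."""
    k1 = rhs(s, u)
    k2 = rhs(tuple(a + 0.5 * h * b for a, b in zip(s, k1)), u)
    k3 = rhs(tuple(a + 0.5 * h * b for a, b in zip(s, k2)), u)
    k4 = rhs(tuple(a + h * b for a, b in zip(s, k3)), u)
    return tuple(
        a + h / 6.0 * (b + 2 * c + 2 * d + e)
        for a, b, c, d, e in zip(s, k1, k2, k3, k4)
    )


def backward_arc(x0, u, h=1e-3, max_steps=100_000):
    """Return (tau, xi, points): switching time tau < 0, start point xi, sampled arc."""
    # psi(0) = (1, 0) for u = +1 and (-1, 0) for u = -1, as in the text
    s = (x0[0], x0[1], float(u), 0.0)
    t = 0.0
    points = [(s[0], s[1])]
    for _ in range(max_steps):
        s_new = rk4_step(s, u, -h)
        # inside the arc u * psi_2 > 0; stop at the first return of psi_2 to zero
        if t < 0.0 and u * s_new[3] <= 0.0:
            w = s[3] / (s[3] - s_new[3])  # linear interpolation weight in (0, 1]
            tau = t - w * h
            xi = (s[0] + w * (s_new[0] - s[0]), s[1] + w * (s_new[1] - s[1]))
            points.append(xi)
            return tau, xi, points
        s, t = s_new, t - h
        points.append((s[0], s[1]))
    raise RuntimeError("psi_2 did not return to zero within max_steps")


tau, xi, arc = backward_arc((0.0, 0.0), u=1)
```

For this linear example the arc ending at the origin is a lower semicircle about $(1, 0)$, so `tau` should be close to $-\pi$ and `xi` close to $(2, 0)$. Replacing `f`, `f_x1` and `f_x2` gives the same construction for a nonlinear right-hand side.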

We now prove that if $x_0$ lies on the $x_1$-axis, then both points $\xi_+(x_0)$ and $\xi_-(x_0)$ also lie on the $x_1$-axis. In this case, $K_+(x_0)$ is a convex arc located entirely on one side of the $x_1$-axis, with tangents at its ends parallel to the $x_2$-axis (Fig. 1a). If $x_0$ does not lie on the $x_1$-axis, then $\xi_+(x_0)$ lies on the opposite side of the axis from $x_0$, and the arc $K_+(x_0)$ intersects the $x_1$-axis exactly once (Fig. 1b); the same holds for $\xi_-(x_0)$ and $K_-(x_0)$. This follows from the fact that the vector $\psi(t)$ rotates clockwise, and at two consecutive zeros of $\psi_2(t)$, the values of $\psi_1(t)$ must have opposite signs.

The convexity of these arcs follows from the unidirectional rotation of the tangent vector. The fact that the arc $K_+(x_0)$ can intersect the $x_1$-axis at most once is proven by contradiction: if it intersected twice, then $x_2 = 0$ at both points, so the phase velocity vector $\{x_2, f\}$ would be parallel to the $x_2$-axis at two points, which would contradict the property that $\psi_2(t)$ has no zeros between $\tau_+$ and $0$.

Next, we construct the trajectories arriving at the origin that satisfy the maximum principle. Starting from the origin $o$, we construct the arc $K_+(o)$. This arc lies entirely below the $x_1$-axis, and its starting point $\zeta_1 = \xi_+(o)$ lies on the $x_1$-axis (Fig. 4). For any point $z$ on $K_+(o)$, we can define an arc $K_-(z)$ ending at $z$. In particular, the arc $K_-(\zeta_1)$ ends at $\zeta_1$; let its starting point be $\zeta_2 = \xi_-(\zeta_1)$. Continuing this process, we construct a sequence of arcs $K_+(o), K_-(\zeta_1), K_+(\zeta_2), \dots$, forming a trajectory $\eta_+(\zeta_0)$. Similarly, we construct $\eta_-(\zeta_1)$ starting with $K_-(o)$. All such trajectories satisfy the maximum principle and consist of alternating segments where $u = +1$ and $u = -1$.
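For the linear object $\ddot{x} + x = u$ the chain of points $\zeta_1, \zeta_2, \dots$ can be written in closed form, which gives a quick picture of the construction. The sketch rests on a property of the linear case only (an assumption here, not a statement about the general nonlinear object): with $u = \pm 1$ the trajectories are circles about $(u, 0)$ traversed with unit angular speed, and consecutive zeros of $\psi_2$ are $\pi$ apart, so each full arc is a half-turn and its starting point is the reflection of its end point through $(u, 0)$. The helper name `xi_linear` is hypothetical.

```python
def xi_linear(x, u):
    """Starting point of the arc ending at x (linear case): reflection through (u, 0)."""
    return (2.0 * u - x[0], -x[1])


def zeta_chain(n_arcs, first_u=1):
    """Points o, zeta_1, zeta_2, ... obtained by alternating u = +1, -1, +1, ..."""
    points = [(0.0, 0.0)]
    u = first_u
    for _ in range(n_arcs):
        points.append(xi_linear(points[-1], u))
        u = -u
    return points


chain_plus = zeta_chain(4, first_u=1)    # begins with K_+(o)
chain_minus = zeta_chain(4, first_u=-1)  # begins with K_-(o)
```

In this case the points alternate between the two half-axes and their distance from the origin grows by 2 at each step, which is the spiral pattern described in the text.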

The topological structure of these trajectories in the phase plane forms a series of curvilinear quadrilaterals $Q_i^-, Q_i^+$. Each trajectory, except those forming the boundaries of these quadrilaterals, enters a quadrilateral once and exits it at a non-zero angle. The switching line consists of the union of the arcs $K_+(o), K_-(o)$ and their subsequent mappings.

Conclusion

The final result for the time-optimal control of the object (2)–(7) is as follows: the optimal trajectories arriving at the origin are spirals consisting of alternating segments corresponding to $u = +1$ and $u = -1$. The switching line consists of the constructed arcs, and the synthesis of optimal controls is realized by a function $u(x_1, x_2)$ that takes the value $-1$ above the switching line and $+1$ below it. The qualitative behavior of the optimal trajectories is shown in Fig. 10.
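As an illustration of such a synthesis function, the sketch below writes $u(x_1, x_2)$ explicitly for the linear object $\ddot{x} + x = u$, where the switching line is known to consist of unit semicircles: below the axis and centered at $(1, 0), (3, 0), \dots$ for $x_1 > 0$, above the axis and centered at $(-1, 0), (-3, 0), \dots$ for $x_1 < 0$. This is the classical linear case, not the general nonlinear object, whose switching line has to be built from the arcs numerically. The names `switching_height` and `u_opt`, and the convention of returning 0 at the origin, are assumptions of this sketch.

```python
import math


def switching_height(x1):
    """Ordinate of the switching line over abscissa x1 (linear case only)."""
    if x1 == 0.0:
        return 0.0
    a = abs(x1)
    center = 2 * math.floor(a / 2.0) + 1  # center of the unit semicircle covering a
    h = math.sqrt(max(0.0, 1.0 - (a - center) ** 2))
    return -h if x1 > 0 else h


def u_opt(x1, x2):
    """Feedback control: -1 above the switching line, +1 below it."""
    s = switching_height(x1)
    if x2 > s:
        return -1
    if x2 < s:
        return 1
    if x1 == 0.0:
        return 0  # at the origin: hold the object at rest (a convention of this sketch)
    # on the line itself: the branch with x1 > 0 is entered with u = +1, the other with u = -1
    return 1 if x1 > 0 else -1
```

For example, `u_opt(0.0, 1.0)` gives $-1$ and `u_opt(0.0, -1.0)` gives $+1$, matching the rule of $-1$ above the switching line and $+1$ below it.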

References

  1. Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V., Mishchenko, E. F. The Mathematical Theory of Optimal Processes. Fizmatgiz, 1961.
  2. Boltyanskii, V. G. Izv. AN SSSR, Ser. Mat., 28, no. 3, 481–514, 1964.
  3. Boltyanskii, V. G., Roitenberg, E. Ya. Kibernetika, 52–56, 1966.
  4. Boltyanskii, V. G. Mathematical Methods of Optimal Control. "Nauka", 1966.

Received by the Editorial Board September 1966. V.A. Steklov Mathematical Institute.
