On the Optimal Values and Points of a Nonlinear Function
Unknown
Submitted 1963-01-01 | SovietRxiv: ru-196301.43997 | Translated from Russian

Abstract Generated abstract

The paper studies the minimization of a nonlinear sum of products of exponential terms with bases between zero and one, subject to row-sum and nonnegativity constraints on a matrix of variables. It introduces characteristic numbers associated with optimal points, proves their uniqueness, and uses them to identify which bases can be active at an optimum. The argument reduces the general problem with several rows to a one-row problem once the characteristic numbers are known, and then gives an inductive procedure, including a bisection method, for determining these numbers from lower-dimensional cases. The paper also describes how to reconstruct optimal points from the reduced problems and notes an approximation estimate for characteristic numbers obtained by bisection.

Full Text

MATHEMATICS

I. A. Bakhtin

ON THE OPTIMAL VALUES AND POINTS OF ONE NONLINEAR FUNCTION

(Presented by Academician S. L. Sobolev on 13 VIII 1962)

The present paper is devoted to the problem of finding the least value and the points at which this value is attained, of the function

\[ \Phi=\sum_{j=1}^{L}\prod_{i=1}^{N}\alpha_{ij}^{x_{ij}} \qquad (0<\alpha_{ij}\leqslant 1) \tag{1} \]

on the polyhedron

\[ \sum_{j=1}^{L}x_{ij}=m_i \qquad (m_i>0,\quad i=1,\ldots,N); \tag{2} \]

\[ x_{ij}\geqslant 0 \qquad (i=1,\ldots,N;\ j=1,\ldots,L). \tag{3} \]

This problem for the special cases \(N=1,2\) was solved jointly by M. A. Krasnosel’skii, A. Yu. Levin, and I. A. Bakhtin.

We shall assume that none of the sets \(\{\alpha_{1j}\}\), \(\{\alpha_{2j}\},\ldots,\{\alpha_{Nj}\}\) consists entirely of ones, since the general case is easily reduced to this one if the function (1) is not constant.

  1. Below we use the following notation and definitions. The points of definition of the function (1) will be denoted by matrices

\[ X=(x_{ij})= \begin{pmatrix} x_{11}&\cdots&x_{1L}\\ \cdots&\cdots&\cdots\\ x_{N1}&\cdots&x_{NL} \end{pmatrix}. \tag{4} \]

The least value of the function (1) on the polyhedron (2)—(3) and the points

\[ X_0=(x_{ij}^{0})= \begin{pmatrix} x_{11}^{0}&\cdots&x_{1L}^{0}\\ \cdots&\cdots&\cdots\\ x_{N1}^{0}&\cdots&x_{NL}^{0} \end{pmatrix}, \tag{5} \]

at which the function assumes this value, will be called optimal. The existence of optimal values and points of the function (1) on the polyhedron (2)—(3) is obvious. Therefore it suffices to indicate a method for finding these quantities.

The set of bases \((\alpha_{1j},\alpha_{2j},\ldots,\alpha_{Nj})\) will be called active if, at the optimal point (5), at least one of the numbers of the set \((x_{1j}^{0},x_{2j}^{0},\ldots,x_{Nj}^{0})\) is different from zero. In the opposite case this set of bases will be called inactive. Similarly, the activity or inactivity of each of the bases \(\alpha_{ij}\) \((i=1,\ldots,N;\ j=1,\ldots,L)\) is defined: the base \(\alpha_{ij}\) is active if \(x_{ij}^{0}\ne 0\), and inactive if \(x_{ij}^{0}=0\).

  1. Let \(i\) be an arbitrary number of the set \((1,2,\ldots,N)\). Mark in the \(i\)-th row of the optimal point (5) all numbers different from zero,

\[ x_{ij_1}^{0},\ x_{ij_2}^{0},\ldots,\ x_{ij_k}^{0} \qquad (k\leqslant L). \tag{6} \]

It turns out that there exists a positive number \(h_i\) such that for all \(j=j_1,j_2,\ldots,j_k\)

\[ h_i=|\ln \alpha_{ij}|\,\alpha_{1j}^{x_{1j}^0}\alpha_{2j}^{x_{2j}^0}\cdots \alpha_{Nj}^{x_{Nj}^0} \tag{7} \]

and for all the remaining \(j\)

\[ h_i\geq |\ln \alpha_{ij}|\,\alpha_{1j}^{x_{1j}^0}\alpha_{2j}^{x_{2j}^0}\cdots \alpha_{Nj}^{x_{Nj}^0}. \tag{8} \]

The numbers \(h_1,h_2,\ldots,h_N;\ \varkappa_1=1,\ \varkappa_2=h_2/h_1,\ldots,\varkappa_N=h_N/h_1\) will be called the characteristic numbers of function (1).

Theorem 1. Function (1), defined on the polyhedron (2)—(3), has a unique system of characteristic numbers \(h_1,h_2,\ldots,h_N;\ \varkappa_1,\varkappa_2,\ldots,\varkappa_N\).

It follows from Theorem 1 that if function (1) assumes its optimal value not at one point, but at several points of the polyhedron (2)—(3), then the systems of characteristic numbers \(h_1,\ldots,h_N;\ \varkappa_1,\ldots,\varkappa_N\) corresponding to these points coincide. Knowledge of the characteristic numbers \(\varkappa_2,\ldots,\varkappa_N\) of function (1) makes it possible simply to determine which of the bases \(\alpha_{ij}\) may be active.

Theorem 2. If \(\varkappa_1=1,\ \varkappa_2,\ldots,\varkappa_N\) are the characteristic numbers of function (1), defined on the polyhedron (2)—(3), then in each of the aggregates \((\alpha_{1j},\alpha_{2j},\ldots,\alpha_{Nj})\) only those bases \(\alpha_i\) \((i=1,\ldots,N)\) may be active for which

\[ \alpha_{ij}^{1/\varkappa_i}=\min\{\alpha_{1j},\ \alpha_{2j}^{1/\varkappa_2},\ldots,\alpha_{Nj}^{1/\varkappa_N}\}. \tag{9} \]

3. In this section we shall give a theorem reducing the general case \(N>1\) to the case \(N=1\), when the characteristic numbers \(\varkappa_2,\ldots,\varkappa_N\) are known. Consider the function

\[ \varphi=\sum_{j=1}^{L}\beta_j^{y_j}, \tag{10} \]

where \(\beta_j=\min\{\alpha_{1j},\ \alpha_{2j}^{1/\varkappa_2},\ldots,\alpha_{Nj}^{1/\varkappa_N}\}\), defined on the polyhedron

\[ \sum_{j=1}^{L} y_j=m_1+\varkappa_2 m_2+\cdots+\varkappa_N m_N,\qquad y_j\geq 0. \tag{11} \]

Let

\[ (y_1^0,y_2^0,\ldots,y_L^0) \tag{12} \]

be an optimal point of function (10) on the polyhedron (11).

Theorem 3. The least value of function (1) on the polyhedron (2)—(3) coincides with the least value of function (10) on the polyhedron (11), and for any index \(j\) \((j=1,\ldots,L)\) and optimal sets (5) and (12)

\[ y_j^0=x_{1j}^0+\varkappa_2 x_{2j}^0+\cdots+\varkappa_N x_{Nj}^0. \tag{13} \]

Theorems 1 and 3 show that the problem of finding the optimal quantities of function (1) on the polyhedron (2)—(3) reduces to the problem of finding its characteristic numbers \(\varkappa_2,\varkappa_3,\ldots,\varkappa_N\).

We shall give one method for finding these numbers, based on knowledge of the solution of the problem for the cases \(1,2,\ldots,N-1\).

4. From relations (7) and (8) and from the definition of the numbers \(\varkappa_2,\varkappa_3,\ldots,\varkappa_N\) it follows that each of the numbers \(\varkappa_i\) \((i=2,\ldots,N)\) is either equal to one of

from the terms of the sequence

\[ |\ln \alpha_{i1}|/|\ln \alpha_{11}|,\quad |\ln \alpha_{i2}|/|\ln \alpha_{12}|,\ldots,\quad |\ln \alpha_{iL}|/|\ln \alpha_{1L}|, \tag{14} \]

or is contained between some two of the terms of this sequence. Let

\[ \gamma_1<\gamma_2<\cdots<\gamma_P\quad (P\leq L) \tag{15} \]

be the sequence composed of all distinct terms of the sequence (14) for \(i=N\). Obviously, the finite positive number \(\chi_N\) satisfies the relation

\[ \gamma_1\leq \chi_N\leq \gamma_P. \tag{16} \]

Denote by \(\chi\) an arbitrary positive number. Using this number and the function (1)—(3), construct a new function

\[ F=\sum_{j=1}^{L}\prod_{i=1}^{N-1}\beta_{ij}^{z_{ij}}, \tag{17} \]

where \(\beta_{1j}=\min\{\beta_{1j},\alpha_{Nj}^{1/\chi}\}\) and \(\beta_{ij}=\alpha_{ij}\) for \(i=2,\ldots,N-1\), on the polyhedron

\[ \sum_{j=1}^{L} z_{1j}=m_1+\chi m_N,\qquad \sum_{j=1}^{L} z_{ij}=m_i\quad (i=2,\ldots,N-1); \tag{18} \]

\[ z_{ij}\geq 0\quad (i=1,\ldots,N-1;\ j=1,\ldots,L). \tag{19} \]

Let

\[ Z_0=(z_{ij}^0)= \begin{pmatrix} z_{11}^0 & \cdots & z_{1L}^0\\ \cdots & \cdots & \cdots\\ z_{(N-1)1}^0 & \cdots & z_{(N-1)L}^0 \end{pmatrix} \tag{20} \]

be an optimal point of the function (17) on the polyhedron (18)—(19). Denote by \(\sum'\) the summation extended over all such indices \(j\) (\(j=1,\ldots,L\)) for which \(\beta_{1j}=\alpha_{1j}\), and by \(\sum''\) the summation extended over all \(j\) (\(j=1,\ldots,L\)) for which \(\beta_{1j}=\alpha_{Nj}^{1/\chi_N}\).

Theorem 4. If \(\sum' z_{1j}^0<m_1\), then \(\chi<\chi_N\); if \(\sum' z_{1j}^0\geq m_1\), \(\sum' z_{1j}^0\geq \chi m_N\), then \(\chi=\chi_N\); and if \(\sum' z_{1j}^0<\chi m_N\), then \(\chi>\chi_N\). These cases are mutually exclusive.

Theorem 4 permits the following bisection method to be applied for finding the number \(\chi_N\). The sequence (15) is divided by its middle term \(\gamma_{p_1}\) into two parts

\[ I_1':\quad \gamma_1<\gamma_2<\cdots<\gamma_{p_1}, \]

\[ I_1'':\quad \gamma_{p_1+1}<\gamma_{p_1+2}<\cdots<\gamma_P. \]

Then the number \(\gamma_{p_1}\) is compared with \(\chi_N\) by means of Theorem 4. If \(\gamma_{p_1}<\chi_N\), then the part \(I_1''\) is further divided by the middle term into parts \(I_2'\), \(I_2''\); if \(\gamma_{p_1}>\chi_N\), then \(I_1'\) is divided. In the case \(\gamma_{p_1}=\chi_N\) the process terminates at this point. Continuing this process further, at some step we either determine the number \(\chi_N\), or indicate an interval containing \(\chi_N\), whose endpoints are two adjacent terms of the sequence (15). Note that if \(\chi=\chi_N\), then all other characteristic numbers \(\chi_2,\ldots,\chi_{N-1}\) of the function (1) coincide with the corresponding characteristic numbers \(\delta_2,\ldots,\delta_{N-1}\) of the function (17).

Carrying out analogous computations for the numbers \(\chi_2,\ldots,\chi_{N-1}\), we either find the exact values \(\chi_2,\ldots,\chi_{N-1}\), or prove, with the aid of the theo-

from Theorem 2, that the least value of the function (1)—(3) coincides with the least value of a function of the form

\[ f=\sum_{s=1}^{k}\alpha_{1j_s}^{x_{1j_s}}+ \sum_{s=k+1}^{L}\prod_{i=2}^{N}\alpha_{ij_s}^{x_{ij_s}}, \tag{21} \]

where the first sum on the right-hand side extends over all such \(s\) for which

\[ \alpha_{1j_s}<\min\{\alpha_{2j_s}^{1/x_2},\ldots,\alpha_{Nj_s}^{1/x_N}\}, \]

on the polyhedron

\[ \sum_{s=1}^{k}x_{1j_s}=m_1,\qquad x_{1j_s}\geq 0; \tag{22} \]

\[ \sum_{s=k+1}^{L}x_{ij_s}=m_i,\qquad x_{ij_s}\geq 0\quad (i=2,\ldots,N). \tag{23} \]

The characteristic numbers of the function (1)—(3) coincide with the characteristic numbers of the function (21)—(23).

  1. Let us note that if \(\varkappa=\varkappa_N\), then the optimal point (5) is constructed from the optimal point (20) as follows: set \(x_{ij}^{0}=z_{ij}^{0}\) if \(2\leq j\leq N-1\); set \(x_{1j}^{0}=z_{1j}^{0}\), \(x_{Nj}^{0}=0\), if \(\alpha_{1j}=\beta_{1j}<\alpha_{Nj}^{1/\varkappa}\); set \(x_{1j}^{0}=0\), \(x_{Nj}^{0}=z_{1j}^{0}/\varkappa\), if \(\alpha_{1j}>\beta_{1j}=\alpha_{Nj}^{1/\varkappa}\); for the remaining indices \(j\), choose nonnegative \(x_{1j}^{0}\) and \(x_{Nj}^{0}\) arbitrarily, but subject to the condition that \(x_{1j}^{0}+\varkappa x_{Nj}^{0}=z_{1j}^{0}\) and

\[ \sum_{j=1}^{L}x_{1j}^{0}=m_1. \]

In the case when the characteristic numbers of the function (1) are determined by means of the function (21), the optimal point (5) of the function (1) is constructed as follows: the coordinates of the optimal point of the function (21) are taken as the corresponding coordinates of the point (5). The remaining coordinates \(x_{ij}^{0}\) of the point (5) are set equal to zero.

In conclusion, we remark that if the number \(\varkappa_N\) is found by the method of bisection of the interval \([\gamma_1,\gamma_p]\) (for simplicity, the number \(\gamma_p\) is assumed different from infinity) with accuracy up to \(\varepsilon\) (\(|\varkappa-\varkappa_N|<\varepsilon\)), then the characteristic numbers \(\delta_2,\ldots,\delta_{N-1}\) of the function (17) approximate the corresponding numbers \(\varkappa_2,\ldots,\varkappa_{N-1}\) of the function (1) with accuracy up to \(\varepsilon\cdot c\), where \(c\) is a certain constant determined from the bases \(\alpha_{ij}\).

Voronezh State
Pedagogical Institute

Received
7 VIII 1962

Submission history

On the Optimal Values and Points of a Nonlinear Function