Abstract Generated abstract
This note studies iterative methods for solving nonlinear equations of the form F(x) = 0 in a real Hilbert space, where F is the gradient of a twice continuously differentiable functional. Under strong monotonicity and boundedness assumptions on the derivative, it establishes existence and uniqueness of the solution in specified balls and proves geometric convergence of fixed step iterations, including a version allowing bounded computational errors. The paper further analyzes variable metric and Newton type iterations with line minimization, showing convergence from arbitrary initial approximations to the minimizer of the functional, with geometric convergence in general and faster than geometric convergence when the approximating operators tend to the inverse derivative at the solution. It emphasizes that the results rely essentially on the variational structure and symmetry of the derivative, distinguishing the method from general operator equations and from classical Newton and steepest descent methods.
Full Text
MATHEMATICS
M. N. YAKOVLEV
ON THE SOLUTION OF NONLINEAR EQUATIONS BY THE METHOD OF ITERATIONS
(Presented by Academician V. I. Smirnov on 22 I 1964)
Let in a real Hilbert space \(H\) there be given a twice continuously differentiable functional \(f(x)\) and let \(F(x)\) be the gradient of the functional \(f(x)\), i.e., the operator defined by the formula
\[ \lim_{t\to 0}\frac{f(x+th)-f(x)}{t}=(F(x),h) \]
for arbitrary \(h\in H\) (see, for example, (1)).
In this note we consider iterative methods for solving the equation
\[ F(x)=0. \tag{1} \]
1. Theorem 1. Let there exist \(x_0\in H\) and a number \(r>0\) such that:
a) \((F'(x)h,h)\geq m(h,h)\), \(m>0\) for \(h\in H\) and \(x\in S(x_0,r)\);
b) \(r\geq \|F(x_0)\|/m\);
c) \((F'(x)h,h)\leq M(h,h)\) for \(h\in H\) and \(x\in S(x_0,\|F(x_0)\|/m)\).
Then in the sphere \(S(x_0,\|F(x_0)\|/m)\) there exists a solution \(x^*\) of equation (1), unique in the sphere \(S(x_0,r)\), and the iterative process starting from \(x_0\) converges to it:
\[ x_{n+1}=x_n-\frac{2}{M+m}F(x_n)\quad (n=0,1,2,\ldots), \]
moreover
\[ \|x_n-x^*\|\leq \left(\frac{M-m}{M+m}\right)^n\|x_0-x^*\|. \]
Remark. Under the conditions of Theorem 1, the solution \(x^*\) of the equation \(F(x)=0\), unique in \(S(x_0,r)\), is also the unique point of minimum of the functional \(f(x)\) in \(S(x_0,r)\).
In the case when the computations are carried out with errors, the following holds:
Theorem 2. Let there exist \(x_0\in H\) and numbers \(r>0\) and \(\bar h>0\) such that:
a) \((F'(x)h,h)\geq m(h,h)\), \(m>0\) for \(h\in H\) and \(x\in S(x_0,r)\);
b)
\[ r\geq \frac{\|F(x_0)\|+\bar h}{m}; \]
c) \((F'(x)h,h)\leq M_1(h,h)\) for \(h\in H\) and
\[ x\in S\left(x_0,\frac{1}{m}\bigl[\|F(x_0)\|+\bar h\bigr]\right). \]
Then for any sequence \(h_n\in H\) with \(\|h_n\|\leq \bar h\), the iterative process starting from \(x_0\),
\[ x_{n+1}=x_n-\frac{2}{M_1+m}\,[F(x_n)+h_n]\quad (n=0,1,2,\ldots) \]
gives a sequence \(x_n\) such that
\[ \|x_n-x^*\|\leq \left(\frac{M_1-m}{M_1+m}\right)^n\|x_0-x^*\|+\frac{\bar h}{m}. \]
- Theorem 3. Suppose
\[ (F'(x)h,h)\geq m(h,h),\qquad m>0,\quad x,h\in H; \]
\(A_n\) is a sequence of symmetric, bounded, positive definite operators such that
\[ \overline m(A_n^{-1}h,h)\leq (F'(x)h,h)\leq \overline M(h,h), \]
\[ 0<\overline m\leq \overline M<\infty \quad\text{for } h\in H \text{ and } x\in S\left(x^*,\sqrt{\frac{2}{m}\,[f(x_0)-f(x^*)]}\right), \]
where \(x_0\) is some element of \(H\).
Then the function \(\psi(\alpha)=f(x_n-\alpha A_nF(x_n))\) has a unique minimum \(\alpha_n\), and the iterative process, starting from \(x_0\),
\[ x_{n+1}=x_n-\gamma_n A_nF(x_n)\qquad (n=0,1,2,\ldots), \]
where \(\gamma_n\) are arbitrary numbers satisfying the condition
\(\varepsilon\leq\gamma_n\leq\alpha_n,\ \varepsilon>0\), gives a minimizing sequence for \(f(x)\), with \(f(x_n)\to f(x^*)\) and \(x_n\to x^*\) at the rate of a geometric progression with ratios respectively \(\tau\) and \(\sqrt{\tau}\), where
\[ \tau=1-\varepsilon_1\left(1-\frac{\varepsilon_1\overline M}{2}\right)\frac{2\overline m^2}{M}, \qquad \varepsilon_1=\min\left\{\varepsilon,\frac{1}{\overline M}\right\}. \]
In particular, the iterative process
\[ x_{n+1}=x_n-\alpha_n A_nF(x_n)\qquad (n=0,1,2,\ldots) \]
converges, starting from any \(x_0\), at the rate of a geometric progression with ratio
\[ \sqrt{1-m^2/M^2}. \]
- Corollary 1. Suppose
\[ (F'(x)h,h)\geq m(h,h),\qquad m>0,\quad x,h\in H \]
and \(A_n\) is a sequence of symmetric, bounded, positive definite operators, with
\[ m_A(h,h)\leq (A_nh,h)\leq M_A(h,h),\qquad h\in H,\quad 0<m_A\leq M_A<+\infty. \]
Then the function \(\psi(\alpha)=f(x_n-\alpha A_nF(x_n))\) has a unique minimum \(\alpha_n\), and the iterative process
\[ x_{n+1}=x_n-\gamma_n A_nF(x_n)\qquad (n=0,1,2,\ldots), \]
where \(\varepsilon\leq\gamma_n\leq\alpha_n,\ \varepsilon>0\), starting from any \(x_0\), gives a minimizing sequence for \(f(x)\), with \(f(x_n)\to f(x^*)\) and \(x_n\to x^*\) at the rate of a geometric progression.
Corollary 2. Suppose
\[ (F'(x)h,h)\geq m(h,h),\qquad m>0,\quad x,h\in H. \]
Then the iterative process
\[ x_{n+1}=x_n-\gamma_n [F'(x_n)]^{-1}F(x_n)\qquad (n=0,1,2,\ldots) \]
with \(\varepsilon\leq\gamma_n\leq\alpha_n,\ \varepsilon>0\), where \(\alpha_n\) is the unique minimum of the function
\(f(x_n-\alpha [F'(x_n)]^{-1}F(x_n))\), gives a minimizing sequence for \(f(x)\), with \(f(x_n)\to f(x^*)\) and \(x_n\to x^*\) at the rate of a geometric progression.
Theorem 4. Suppose the conditions of Theorem 3 are fulfilled and
\[ \|A_n-[F'(x^*)]^{-1}\|\to 0\qquad \text{as } n\to\infty. \]
Then the iterative process
\[ x_{n+1}=x_n-\alpha_n A_nF(x_n)\qquad (n=0,1,2,\ldots) \]
converges, starting from any \(x_0\), to \(x^*\) at a rate higher than a geometric progression, i.e.,
\[
\|x_n-x^*\|\leq Cq_1q_2\ldots q_n,\qquad q_k<1,\qquad q_k\to 0\quad \text{as } k\to\infty .
\]
Corollary 3. Let the condition
\[
(F'(x)h,h)\geq m(h,h),\qquad m>0,\qquad x,h\in H
\]
be satisfied. Then the iterative process
\[
x_{n+1}=x_n-\alpha_n [F'(x_n)]^{-1}F(x_n)\qquad (n=0,1,2,\ldots),
\]
where \(\alpha_n\) is determined from the condition that the function
\[
\psi(\alpha)=f\bigl(x_n-\alpha [F'(x_n)]^{-1}F(x_n)\bigr)
\]
attain its minimum, converges to \(x^*\) at a rate greater than that of a geometric progression.
The equation considered in this paper is not the general equation \(\Phi(x)=0\), where \(\Phi(x)\) is an operator from \(\mathcal H\) into itself. The proofs of the assertions indicated above essentially use the fact that \(F(x)\) is the gradient of a functional. In particular, it is essential that \(F'(x)\) be a symmetric operator. In the finite-dimensional case of \(n\) dimensions, the problem considered is the problem of finding a stationary point.
The iterative process
\[
x_{n+1}=x_n-\alpha_n [F'(x_n)]^{-1}F(x_n)\qquad (n=0,1,2,\ldots),
\]
unlike Newton’s method, which for its convergence requires an initial approximation close to the solution \(x^*\), converges from any initial approximation.
Unlike the method of steepest descent, the rate of convergence increases as the solution is approached.
Received 17 I 1964CITED LITERATURE
- M. M. Vainberg, Variational Methods for the Study of Nonlinear Operators, Moscow, 1956.