Abstract
This paper derives elementwise upper estimates for the spherical matrix norm, or spectral norm, and for the corresponding logarithmic norm, without using the block decompositions employed in earlier work. It presents a general comparison principle based on bounds for the spectral radius of positive Hermitian matrices, obtains consequences involving row, column, and Frobenius type norms, and constructs modified norms that yield explicit logarithmic norm estimates. Further theorems give bounds using selected diagonal entries and off-diagonal sums, including variants related to Ostrowski-type eigenvalue estimates. A numerical example indicates that the resulting estimates can improve previously reported bounds for the spectral radius and spectral abscissa.
UDC 512.831
MATHEMATICS
S. M. LOZINSKII
ESTIMATES OF THE SPHERICAL MATRIX NORM AND OF THE CORRESPONDING LOGARITHMIC NORM
(Presented by Academician V. I. Smirnov on 13 IV 1965)
1. Notation. Up to and including Sec. 5, all notation and definitions from (1) are adopted. The symbols \(\Phi_{\mathrm I}\), \(\Phi_{\mathrm{II}}\), \(\Phi_{\mathrm{III}}\) denote the matrix norms induced, respectively, by the 1st, 2nd, and 3rd vector norms \(\bigl((2),\ \text{p. }121\bigr)\), i.e., the norms obtained from (1), Sec. 6, for \(p=+\infty\), \(p=1\), \(p=2\). \(\Phi_{\mathrm{III}}\) is often called the spherical matrix norm. The symbol \(\gamma_{\mathrm{III}}\) denotes the logarithmic matrix norm corresponding to \(\Phi_{\mathrm{III}}\). If \(A \in \mathbf M\), then \(A^*\) denotes the matrix Hermitian-conjugate to \(A\). If \(A\) is a square matrix, then
\[ \widetilde A \overset{\mathrm{def}}{=} \frac12(A+A^*). \]
If \(\Sigma\) denotes a simple or double sum, then \(\Sigma'\) denotes the sum obtained from \(\Sigma\) by replacing by zeros the terms with equal indices. For any \(A \in \mathbf M\) we have (see (2), pp. 125–127)
\[ \Phi_{\mathrm I}(A)=\max_\mu \sum_\nu |a_{\mu\nu}|; \tag{1} \]
\[ \Phi_{\mathrm{II}}(A)=\max_\nu \sum_\mu |a_{\mu\nu}|; \tag{2} \]
\[ \Phi_{\mathrm{III}}(A)=\sqrt{\rho(AA^*)}=\sqrt{\rho(A^*A)}, \tag{3} \]
and for a square matrix \(A\), in addition (see (3), pp. 59–60),
\[ \gamma_{\mathrm{III}}(A)=\sigma(\widetilde A). \tag{4} \]
2. In (1), estimates were obtained for \(\Phi_{\mathrm{III}}(A)\) and \(\gamma_{\mathrm{III}}(A)\) in terms of the elements of the matrix \(A\), using a decomposition of the matrix into blocks. Here we give estimates that do not use such a decomposition.
3. Let \(\Phi\) be a real-valued function defined on the set of Hermitian matrices having only nonnegative eigenvalues, and such that on this set \(\rho(A) \leqslant \Phi(A)\). Define on \(\mathbf M\) the functions \(L_\Phi\) and \(R_\Phi\) by the formulas
\[ L_\Phi(A)\overset{\mathrm{def}}{=}\sqrt{\Phi(AA^*)}; \tag{5_1} \]
\[ R_\Phi(A)\overset{\mathrm{def}}{=}\sqrt{\Phi(A^*A)}. \tag{5_2} \]
4. Theorem 1. Under the assumptions of Sec. 3, for \(A \in \mathbf M\) we have
\[ \Phi_{\mathrm{III}}(A)\leqslant L_\Phi(A); \tag{6_1} \]
\[ \Phi_{\mathrm{III}}(A)\leqslant R_\Phi(A). \tag{6_2} \]
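Proof. Since \(AA^*\) is Hermitian and has only nonnegative eigenvalues, \(\rho(AA^*) \leqslant \Phi(AA^*)\) by the assumption on \(\Phi\). Applying (3), we obtain \((6_1)\). Inequality \((6_2)\) is obtained similarly from \(A^*A\).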
If \(\Phi\) is expressed explicitly in terms of the elements of the matrix, then inequalities (6) give an estimate of \(\Phi_{\mathrm{III}}(A)\) expressed explicitly in terms of the elements of the matrix \(A\).
For example, putting in (6) \(\Phi=\Phi_{\mathrm I}\), we obtain
\[ \Phi_{\mathrm{III}}(A)\leq \min\left(\sqrt{\Phi_{\mathrm I}(AA^*)},\sqrt{\Phi_{\mathrm I}(A^*A)}\right) \quad \text{for } A\in \mathbf M . \tag{7} \]
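Using the submultiplicativity of \(\Phi_{\mathrm I}\) \(\bigl((3),\ \text{p. }56\bigr)\) and the equality \(\Phi_{\mathrm I}(A^*)=\Phi_{\mathrm{II}}(A)\), which follows from (1) and (2), we obtain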
\[ \Phi_{\mathrm I}(AA^*)\leq \Phi_{\mathrm I}(A)\Phi_{\mathrm I}(A^*)= \Phi_{\mathrm I}(A)\Phi_{\mathrm {II}}(A), \]
which together with (7) gives
\[ \Phi_{\mathrm{III}}(A)\leq \sqrt{\Phi_{\mathrm I}(A)\Phi_{\mathrm {II}}(A)} \quad \text{for } A\in \mathbf M . \tag{8} \]
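Estimates (7) and (8) are easy to check numerically. The following is a minimal pure-Python sketch added for illustration, not part of the original text: the helper names (`phi_I`, `phi_II`, `phi_III`, `rho_psd`) and the test matrix are assumptions made here, and the spectral radius of the Hermitian positive semidefinite matrix \(AA^*\) is approximated by plain power iteration rather than by any method used by the author.

```python
import math

def conj_transpose(A):
    """Hermitian conjugate A* of a matrix stored as a list of rows."""
    rows, cols = len(A), len(A[0])
    return [[complex(A[i][j]).conjugate() for i in range(rows)]
            for j in range(cols)]

def matmul(A, B):
    """Ordinary matrix product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def phi_I(A):
    """Formula (1): largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in A)

def phi_II(A):
    """Formula (2): largest absolute column sum (row sums of A*)."""
    return phi_I(conj_transpose(A))

def rho_psd(H, iters=500):
    """Spectral radius of a Hermitian positive semidefinite matrix H,
    approximated by power iteration (illustrative, not robust)."""
    n = len(H)
    v = [1.0 + 0.1 * i for i in range(n)]  # generic starting vector
    lam = 0.0
    for _ in range(iters):
        w = [sum(H[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(abs(x) ** 2 for x in w))
        if lam == 0.0:
            return 0.0
        v = [x / lam for x in w]
    return lam

def phi_III(A):
    """Formula (3): square root of the spectral radius of A A*."""
    return math.sqrt(rho_psd(matmul(A, conj_transpose(A))))

# Hypothetical test matrix, chosen only for illustration.
A = [[3, 0], [4, 5]]
A_star = conj_transpose(A)
bound7 = min(math.sqrt(phi_I(matmul(A, A_star))),
             math.sqrt(phi_I(matmul(A_star, A))))   # right-hand side of (7)
bound8 = math.sqrt(phi_I(A) * phi_II(A))            # right-hand side of (8)
print(phi_III(A), bound7, bound8)
```

For this particular matrix \(A^*A\) has row sums 45 and eigenvalues 45 and 5, so estimate (7) is attained: \(\Phi_{\mathrm{III}}(A)=\sqrt{45}\approx 6.708\), while (8) gives \(\sqrt{63}\approx 7.937\).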
5. Define on \(\mathbf M\) the function \(N\) by the formula
\[ N(A)\stackrel{\mathrm{def}}{=} \left(\sum_{\mu=1}^{m}\sum_{\nu=1}^{n}|a_{\mu\nu}|^2\right)^{1/2} \quad \text{for } A\in \mathbf M_{m\times n}; \tag{9} \]
\(N\), as is known, is a matrix norm, and \(\Phi_{\mathrm{III}}\leq N\) on \(\mathbf M\).
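The inequality \(\Phi_{\mathrm{III}}\leq N\) can also be read off from Theorem 1. The following short check is added for convenience and assumes the choice \(\Phi=\operatorname{tr}\) in Sec. 3, which is not made explicitly in the text. The eigenvalues of a Hermitian matrix \(H\) with nonnegative eigenvalues sum to \(\operatorname{tr} H\), so \(\rho(H)\leq \operatorname{tr} H\), and therefore
\[ [\Phi_{\mathrm{III}}(A)]^2=\rho(AA^*)\leq \operatorname{tr}(AA^*)=\sum_{\mu=1}^{m}\sum_{\nu=1}^{n}|a_{\mu\nu}|^2=[N(A)]^2 . \]
With this choice of \(\Phi\), both \(L_\Phi\) and \(R_\Phi\) coincide with \(N\).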
6. From this point on (with the exception of Sec. 13), we consider only square matrices of fixed order \(n\), and the term matrix norm is used in the sense of a norm on \(\mathbf M_n\). The remaining symbols and terms are as above.
7. If \(A\) is a square matrix, then the symbol \(D_A\) (\(S_A\)) denotes the matrix obtained from \(A\) by replacing the off-diagonal (diagonal) elements by zeros.
8. Let \(\Phi\) be a nonnegative numerical function on \(\mathbf M_n\). Define on \(\mathbf M_n\) the function \(\overline{\Phi}\) by the formula
\[ \overline{\Phi}(A)\stackrel{\mathrm{def}}{=} \max\left(|a_{11}|,\ldots,|a_{nn}|\right)+\Phi(S_A). \tag{10} \]
9. Theorem 2. Let \(\Phi\) and \(\overline{\Phi}\) be as in Sec. 8. Then:
1) If \(\Phi\) is a matrix norm, then \(\overline{\Phi}\) is a matrix norm, \(\overline{\Phi}(E_n)=1\), and
\(\gamma_{\overline{\Phi}}(A)=\max_k \operatorname{Re} a_{kk}+\Phi(S_A)\).
2) If \(\Phi_{\mathrm{III}}\leq \Phi\) on \(\mathbf M_n\), then \(\Phi_{\mathrm{III}}\leq \overline{\Phi}\) on \(\mathbf M_n\).
3) If \(\Phi\) is a matrix norm and \(\Phi_{\mathrm{III}}\leq \Phi\) on \(\mathbf M_n\), then \(\gamma_{\mathrm{III}}\leq \gamma_{\overline{\Phi}}\) on \(\mathbf M_n\).
Proof. 1) This is proved by straightforward computations.
2) Let \(\Phi_{\mathrm{III}}\leq \Phi\) on \(\mathbf M_n\). Fix \(A\in \mathbf M_n\). We have \(A=D_A+S_A\), which gives
\[ \Phi_{\mathrm{III}}(A)\leq \Phi_{\mathrm{III}}(D_A)+\Phi_{\mathrm{III}}(S_A)\leq \max_k |a_{kk}|+\Phi(S_A)=\overline{\Phi}(A). \]
3) This follows from 2) and the definition of the logarithmic norm.
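The computation behind the formula for \(\gamma_{\overline{\Phi}}\) in assertion 1) can be sketched as follows. The sketch assumes that the logarithmic norm is defined, as is usual and presumably as in (1), by \(\gamma_\Phi(A)=\lim_{h\to+0}\bigl(\Phi(E_n+hA)-1\bigr)/h\). For \(h>0\) we have \(D_{E_n+hA}=E_n+hD_A\) and \(S_{E_n+hA}=hS_A\), so by (10) and the homogeneity of the norm \(\Phi\),
\[ \overline{\Phi}(E_n+hA)=\max_k|1+ha_{kk}|+h\,\Phi(S_A),\qquad |1+ha_{kk}|=1+h\operatorname{Re} a_{kk}+O(h^2), \]
whence
\[ \gamma_{\overline{\Phi}}(A)=\lim_{h\to+0}\frac{\max_k|1+ha_{kk}|-1}{h}+\Phi(S_A)=\max_k \operatorname{Re} a_{kk}+\Phi(S_A). \]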
Remark. The function
\[ \overline{N}(A)=\max\left(|a_{11}|,\ldots,|a_{nn}|\right)+N(S_A) \tag{11} \]
will be called the norm of E. Schmidt (see (4), p. 135). Since \(N\) is a matrix norm and \(\Phi_{\mathrm{III}}\leq N\), assertions 1) and 3) of Theorem 2 give
\[ \gamma_{\mathrm{III}}(A)\leq \gamma_{\overline{N}}(A)= \max_k \operatorname{Re} a_{kk}+N(S_A). \tag{12} \]
10. Let \(m\) be a natural number, \(2\leq m\leq n\), and let \(\varkappa\) denote a fixed set of pairwise distinct integers \(k_1,\ldots,k_m\), where \(1\leq k_1,\ldots,k_m\leq n\). Put
\[ T_{\varkappa}(A)\stackrel{\mathrm{def}}{=} \sum_{\mu=1}^{m}\operatorname{Re} a_{k_\mu k_\mu}; \tag{13} \]
\[ N_{\varkappa}(A)\stackrel{\mathrm{def}}{=} \left(\sum_{\mu=1}^{m}|a_{k_\mu k_\mu}|^2+ \sum_{\mu,\nu=1}^{n}{}' |a_{\mu\nu}|^2\right)^{1/2}; \tag{14} \]
\[ \Gamma_\varkappa(A) \overset{\mathrm{def}}{=} \left([N_\varkappa(A)]^2-\frac{[T_\varkappa(A)]^2}{m}\right)^{1/2}; \tag{15} \]
\[ \alpha_\varkappa(A) \overset{\mathrm{def}}{=} \frac{T_\varkappa(A)}{m}-\frac{1}{\sqrt{m(m-1)}}\,\Gamma_\varkappa(A); \tag{16} \]
\[ Z_\varkappa(A) \overset{\mathrm{def}}{=} \frac{T_\varkappa(A)}{m}+\sqrt{\frac{m-1}{m}}\,\Gamma_\varkappa(A). \tag{17} \]
If \(m=n\), then the index \(\varkappa\) on \(T_\varkappa, N_\varkappa, \Gamma_\varkappa, Z_\varkappa\) will sometimes be omitted. Thus, for example, \(T(A)=\operatorname{Re}\operatorname{tr} A\).
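The quantities (13)-(17) are straightforward to evaluate. The sketch below is an illustration added here, not the author's code: the function name `kappa_quantities`, the dictionary it returns, and the test matrix are all hypothetical, and the index set \(\varkappa\) is passed as a list of 1-based indices to match the notation above.

```python
import math

def kappa_quantities(A, kappa):
    """Evaluate (13)-(17) for a square matrix A (list of rows) and a list
    kappa of pairwise distinct 1-based diagonal indices k_1, ..., k_m."""
    n, m = len(A), len(kappa)
    assert 2 <= m <= n and len(set(kappa)) == m
    # (13): sum of the real parts of the selected diagonal elements.
    T = sum(complex(A[k - 1][k - 1]).real for k in kappa)
    # Sum of |a_mu_nu|^2 over all off-diagonal positions (the primed sum).
    off_diag = sum(abs(A[i][j]) ** 2
                   for i in range(n) for j in range(n) if i != j)
    # (14), squared: selected diagonal elements plus all off-diagonal ones.
    N2 = sum(abs(A[k - 1][k - 1]) ** 2 for k in kappa) + off_diag
    # (15); max(..., 0.0) only guards against rounding below zero.
    Gamma = math.sqrt(max(N2 - T * T / m, 0.0))
    alpha = T / m - Gamma / math.sqrt(m * (m - 1))   # (16)
    Z = T / m + math.sqrt((m - 1) / m) * Gamma       # (17)
    return {"T": T, "N": math.sqrt(N2), "Gamma": Gamma, "alpha": alpha, "Z": Z}

# Hypothetical 3x3 example with kappa = {1, 2}, so m = 2 < n = 3.
A = [[4, 1, 0],
     [1, 4, 1],
     [0, 1, 1]]
q = kappa_quantities(A, [1, 2])
print(q)
```

For this matrix \(T_\varkappa=8\), \(N_\varkappa=6\), \(\Gamma_\varkappa=2\), \(\alpha_\varkappa=4-\sqrt2\), and \(Z_\varkappa=4+\sqrt2\).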
11. Theorem 3. Let \(A\in \mathbf M_n\). Then:
1a) If \(m<n\) and
\[ \alpha_\varkappa(A)\ge |a_{jj}|,\qquad \text{for } j\notin \varkappa, \tag{18} \]
then \(\Phi_{\mathrm{III}}(A)\le Z_\varkappa(A)\).
1b) If \(m=n\) and \(T(A)\ge N(A)\), then \(\Phi_{\mathrm{III}}(A)\le Z(A)\).
2a) If \(m<n\) and
\[ \alpha_\varkappa(A)\ge \operatorname{Re} a_{jj}\qquad \text{for } j\notin \varkappa, \tag{19} \]
then \(\gamma_{\mathrm{III}}(A)\le Z_\varkappa(A)\).
2b) \(\gamma_{\mathrm{III}}(A)\le Z(A)\).
3) If \(m<n\) and \(\beta\ge |a_{jj}|\) for \(j\notin \varkappa\), then
\[ \Phi_{\mathrm{III}}(A)\le \begin{cases} \beta+\bigl([N_\varkappa(A)]^2-2\beta T_\varkappa(A)+m\beta^2\bigr)^{1/2},\\ \beta+\Gamma_\varkappa(A)\quad \text{provided } \beta\ge |T_\varkappa(A)|/m. \end{cases} \tag{20} \]
4) If \(m<n\) and \(\beta\ge \operatorname{Re} a_{jj}\) for \(j\notin \varkappa\), then the inequality obtained from (20) by replacing \(\Phi_{\mathrm{III}}\) on the left by \(\gamma_{\mathrm{III}}\) holds; moreover, in the second line of (20) one may replace \(|T_\varkappa(A)|\) by \(T_\varkappa(A)\).
Proof. Let the matrix \(D\) be obtained from \(D_A\) by replacing, for \(j\in \varkappa\), the element \(a_{jj}\) by \(\alpha_\varkappa(A)\). Then, as is easy to calculate,
\[ \alpha_\varkappa(A)+N(A-D)=Z_\varkappa(A). \]
Therefore, if \(m<n\) and (18) holds, then
\[ \Phi_{\mathrm{III}}(A)\le \Phi_{\mathrm{III}}(D)+\Phi_{\mathrm{III}}(A-D)\le \alpha_\varkappa(A)+N(A-D)=Z_\varkappa(A), \tag{21} \]
which proves 1a). If \(m=n\), then \(T(A)\ge N(A)\) is equivalent to \(\alpha_\varkappa(A)\ge 0\) and, consequently, the conditions of 1b) give (21); this proves 1b). If \(m<n\) and (19) holds, then
\[ \gamma_{\mathrm{III}}(A)\le \gamma_{\mathrm{III}}(D)+\gamma_{\mathrm{III}}(A-D)\le \alpha_\varkappa(A)+N(A-D)=Z_\varkappa(A), \tag{22} \]
which gives 2a). If \(m=n\), then \(\gamma_{\mathrm{III}}(D)=\alpha_\varkappa(A)\) and, consequently, (22) holds, which proves 2b).
Let \(m<n\), let \(\beta\) be a real number, and let the matrices \(D_\beta\) and \(\hat D\) be obtained from \(D_A\) by replacing, for \(j\in\varkappa\), the elements \(a_{jj}\) by \(\beta\) and by \(T_\varkappa(A)/m\), respectively. If \(\beta\ge |a_{jj}|\) for \(j\notin\varkappa\), then \(\Phi_{\mathrm{III}}(D_\beta)\le \beta\) and \(\Phi_{\mathrm{III}}(A)\le \Phi_{\mathrm{III}}(D_\beta)+\Phi_{\mathrm{III}}(A-D_\beta)\le \beta+N(A-D_\beta)\), which gives the first of the inequalities (20). If, in addition, \(\beta\ge |T_\varkappa(A)|/m\), then \(\Phi_{\mathrm{III}}(\hat D)\le \beta\) and
\[ \Phi_{\mathrm{III}}(A)\le \Phi_{\mathrm{III}}(\hat D)+\Phi_{\mathrm{III}}(A-\hat D)\le \beta+N(A-\hat D)=\beta+\Gamma_\varkappa(A). \]
Assertion 3) is proved. 4) is proved analogously.
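The identity \(\alpha_\varkappa(A)+N(A-D)=Z_\varkappa(A)\) used at the beginning of the proof can be verified as follows; this worked computation is added for convenience and uses only (13)-(17). Write \(\alpha=\alpha_\varkappa(A)\), \(T=T_\varkappa(A)\), \(\Gamma=\Gamma_\varkappa(A)\). The matrix \(A-D\) has the off-diagonal elements of \(A\), the diagonal elements \(a_{jj}-\alpha\) for \(j\in\varkappa\), and zeros in the remaining diagonal positions. Since \(\alpha\) is real,
\[ [N(A-D)]^2=\sum_{\mu=1}^{m}|a_{k_\mu k_\mu}-\alpha|^2+\sum_{\mu,\nu=1}^{n}{}'|a_{\mu\nu}|^2 =[N_\varkappa(A)]^2-2\alpha T+m\alpha^2=\Gamma^2+m\Bigl(\alpha-\frac{T}{m}\Bigr)^2 . \]
By (16), \(m(\alpha-T/m)^2=\Gamma^2/(m-1)\), so \(N(A-D)=\sqrt{m/(m-1)}\,\Gamma\) and
\[ \alpha+N(A-D)=\frac{T}{m}+\Bigl(\sqrt{\frac{m}{m-1}}-\frac{1}{\sqrt{m(m-1)}}\Bigr)\Gamma=\frac{T}{m}+\sqrt{\frac{m-1}{m}}\,\Gamma=Z_\varkappa(A). \]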
12. Obviously, \(N_\varkappa(\widetilde A)\le N_\varkappa(A)\), and the same is true for \(Z_\varkappa\) and \(\Gamma_\varkappa\); moreover, \(T_\varkappa(A)=T_\varkappa(\widetilde A)\) and \(\alpha_\varkappa(A)\le \alpha_\varkappa(\widetilde A)\). Therefore, when estimating \(\gamma_{\mathrm{III}}(A)\) by Theorem 3, it is recommended first to estimate \(\gamma_{\mathrm{III}}(\widetilde A)\) by this theorem, and then to use the equality \(\gamma_{\mathrm{III}}(A)=\gamma_{\mathrm{III}}(\widetilde A)\).
13. Theorem 4. \(\Phi_{\mathrm{III}}(A)\le \sqrt{Z(AA^*)}\) for \(A\in \mathbf M\).
Proof. The matrix \(AA^*\) is Hermitian with nonnegative eigenvalues, so by (4) \(\rho(AA^*)=\sigma(AA^*)=\gamma_{\mathrm{III}}(AA^*)\). But, by Theorem 3, 2b), \(\gamma_{\mathrm{III}}(AA^*)\le Z(AA^*)\). It remains to apply (3).
14. Put
\[ P_k \stackrel{\mathrm{def}}{=} \sum_{\nu}' |a_{k\nu}|,\qquad Q_k \stackrel{\mathrm{def}}{=} \sum_{\mu}' |a_{\mu k}|. \]
Theorem 5. If \(0\le \alpha \le 1\), then
\[
\sigma(A)\le \max_k\bigl(\operatorname{Re} a_{kk}+P_k^\alpha Q_k^{1-\alpha}\bigr).
\tag{23}
\]
Proof. The following inequality of A. Ostrowski is known (see (5), p. 151):
\[
\rho(A)\le \max_k\bigl(|a_{kk}|+P_k^\alpha Q_k^{1-\alpha}\bigr).
\tag{24}
\]
Replacing \(A\) in (24) by \(E_n+hA\), \(h>0\), and applying Theorem 1 of (1), we obtain (23).
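The right-hand sides of (24) and (23) can be evaluated directly from the elements of the matrix. The sketch below is an illustration added here: the function name `ostrowski_bounds` is hypothetical, and the choice \(\alpha=1/2\) is arbitrary. The matrix used is the one from the Example, whose eigenvalues \(9,\ 9+9i,\ 9-9i\) are quoted from (5).

```python
import math

def ostrowski_bounds(A, alpha):
    """Right-hand sides of (24) and (23) for a square matrix A (list of rows)
    and a parameter 0 <= alpha <= 1. Returns (rho_bound, sigma_bound)."""
    n = len(A)
    assert 0.0 <= alpha <= 1.0
    rho_terms, sigma_terms = [], []
    for k in range(n):
        # P_k and Q_k: absolute row and column sums without the diagonal element.
        P = sum(abs(A[k][v]) for v in range(n) if v != k)
        Q = sum(abs(A[u][k]) for u in range(n) if u != k)
        mixed = P ** alpha * Q ** (1.0 - alpha)
        rho_terms.append(abs(A[k][k]) + mixed)              # term of (24)
        sigma_terms.append(complex(A[k][k]).real + mixed)   # term of (23)
    return max(rho_terms), max(sigma_terms)

A = [[7 + 3j, -4 - 6j, -4],
     [-1 - 6j, 7, -2 - 6j],
     [2, 4 - 6j, 13 - 3j]]
rho_bound, sigma_bound = ostrowski_bounds(A, 0.5)
print(rho_bound, sigma_bound)
```

Since \(\operatorname{Re} a_{kk}\le |a_{kk}|\), the bound (23) for \(\sigma(A)\) never exceeds the bound (24) for \(\rho(A)\) at the same \(\alpha\).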
15. Example ((5), p. 148). The eigenvalues of the matrix
\[ A= \begin{bmatrix} 7+3i & -4-6i & -4\\ -1-6i & 7 & -2-6i\\ 2 & 4-6i & 13-3i \end{bmatrix} \]
are \(9,\ 9+9i,\ 9-9i\); \(\rho(A)=9\sqrt{2}\approx 12.73\), \(\sigma(A)=9\). In (5) several estimates for \(\rho(A)\) and \(\sigma(A)\) were obtained, the best of which are \(\rho(A)<22.05\) and \(\sigma(A)<22.05\). Theorem 4 and Theorem 3, 2b) (the latter applied to \(\widetilde A\), as recommended after Theorem 3) give, respectively,
\(\Phi_{\mathrm{III}}(A)\le \sqrt{Z(AA^*)}<18.02\) and \(\gamma_{\mathrm{III}}(A)=\gamma_{\mathrm{III}}(\widetilde A)\le Z(\widetilde A)<14.2\). Consequently,
\(\rho(A)<18.02,\ \sigma(A)<14.2\). Direct computation gives \(\Phi_{\mathrm{III}}(A)>17.24,\ \gamma_{\mathrm{III}}(A)=13.5\).
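The logarithmic-norm part of the example can be reproduced with a few lines of code. The following pure-Python sketch is an illustration added here, with hypothetical helper names (`frobenius`, `Z_full`, `largest_eigenvalue`). It forms the Hermitian part \(\widetilde A\), evaluates \(Z(\widetilde A)\) from (17) with \(m=n\), and approximates \(\gamma_{\mathrm{III}}(A)=\sigma(\widetilde A)\) by shifted power iteration, a method chosen here for simplicity and not taken from the text.

```python
import math

A = [[7 + 3j, -4 - 6j, -4],
     [-1 - 6j, 7, -2 - 6j],
     [2, 4 - 6j, 13 - 3j]]
n = len(A)

# Hermitian part (A + A*)/2.
H = [[(A[i][j] + complex(A[j][i]).conjugate()) / 2 for j in range(n)]
     for i in range(n)]

def frobenius(M):
    """The norm N of formula (9)."""
    return math.sqrt(sum(abs(x) ** 2 for row in M for x in row))

def Z_full(M):
    """Z(M) from (17) with m = n, i.e. all diagonal elements selected."""
    m = len(M)
    T = sum(complex(M[k][k]).real for k in range(m))
    Gamma = math.sqrt(max(frobenius(M) ** 2 - T * T / m, 0.0))
    return T / m + math.sqrt((m - 1) / m) * Gamma

def largest_eigenvalue(Hm, iters=2000):
    """Largest eigenvalue of a Hermitian matrix, by power iteration on
    Hm + s*E, where the shift s = N(Hm) makes the matrix positive
    semidefinite (illustrative, not robust)."""
    size = len(Hm)
    s = frobenius(Hm)
    v = [1.0 + 0.1 * i for i in range(size)]  # generic starting vector
    lam = 0.0
    for _ in range(iters):
        w = [sum(Hm[i][j] * v[j] for j in range(size)) + s * v[i]
             for i in range(size)]
        lam = math.sqrt(sum(abs(x) ** 2 for x in w))
        if lam == 0.0:
            return 0.0
        v = [x / lam for x in w]
    return lam - s

N_A = frobenius(A)             # equals sqrt(486), the value behind the bound 22.05
Z_H = Z_full(H)                # equals 9 + 3*sqrt(3), just below 14.2
gamma = largest_eigenvalue(H)  # close to 13.5
print(N_A, Z_H, gamma)
```

Here \(\widetilde A\) is real symmetric with \(T(\widetilde A)=27\) and \([N(\widetilde A)]^2=283.5\), so \(Z(\widetilde A)=9+3\sqrt3\approx 14.196\), in agreement with the estimate \(\gamma_{\mathrm{III}}(A)<14.2\).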
Received 28 III 1965
REFERENCES
¹ S. M. Lozinskii, DAN, 163, No. 4, 1965.
² D. K. Faddeev, V. N. Faddeeva, Computational Methods of Linear Algebra, Moscow–Leningrad, 1960.
³ S. M. Lozinskii, Izv. vyssh. uchebn. zaved., Mathematics, No. 5 (1958); No. 5 (1959).
⁴ J. Naas, H. L. Schmid, Mathematisches Wörterbuch, 2, Berlin—Leipzig, 1961.
⁵ M. Marcus, H. Minc, A Survey of Matrix Theory and Matrix Inequalities, Boston, 1964.