Von Neumann's Trace Inequalities

July 18, 2020

Trace inequality is important for matrix analysis. In this post we first discuss Von Neumann’s trace inequality and another related one, which can be viewed as a slightly more detailed version of the trace inequality Wiki page. Then we prove a related trace inequality for general Hermitian matrices.

In 1937, Von Neumann proved the following trace inequality 1, where the trace of the product of two general matrices is upper bounded by the sum of product of their singular values in order.

Theorem (Von Neumann’s trace inequality). If $A, B$ are complex $n \times n$ matrices with singular values,

\[\alpha_1 \geq \cdots \geq \alpha_n, \quad \beta_1 \geq \cdots \geq \beta_n,\]

respectively, then

\[\lvert \mathrm{tr}(AB) \rvert \leq \sum_{i=1}^n \alpha_i \beta_i.\]

Later in 1970, Ruhe proved a related trace inequality for the product of positive semidefinite matrices 2. Importantly, both the upper and the lower bounds are provided. The original statement is for $A^* A$ and $B^* B$ for general complex matrices $A \in \mathbb{C}^{m\times n}$ and $B \in \mathbb{C}^{n\times p}$. Here we restate the lemma with positive semidefinite Hermitian matrices.

Lemma 1 (Ruhe’s trace inequality). If $A, B$ are $n \times n$ positive semidefinite Hermitian matrices with eigenvalues,

\[a_1 \geq \cdots \geq a_n \geq 0, \quad b_1 \geq \cdots \geq b_n \geq 0,\]

respectively, then

\[\sum_{i=1}^n a_i b_{n-i+1} \leq \mathrm{tr}(AB) \leq \sum_{i=1}^n a_i b_i.\]

Generalized version of both Theorem and Lemma 1 are proved by Marshall et al. 3. In the following, we use $\lambda_i(\cdot)$ and $\sigma_i(\cdot)$ to denote the $i$-th eigenvalue and singular value of a matrix. Both are of descending order.

Lemma 2 If $A, B$ are complex $n \times n$ matrices, then for any $k = 1, \dots, n$,

\[\sum_{i=1}^k \sigma_i(AB) \leq \sum_{i=1}^k \sigma_i(A) \sigma_i(B).\]

Lemma 3 If $A, B$ are $n \times n$ positive semidefinite Hermitian matrices, then for any $k = 1, \dots, n$,

\[\sum_{i=1}^k \lambda_i(A) \lambda_{n-i+1}(B) \leq \sum_{i=1}^k \lambda_i(AB)\]

Although Lemma 3 only generalizes lower bound part of Lemma 1, the other inequality also holds under the same generalization.

Lemma 1 for general Hermitian matrices

In (Marshall et al. 2011) 3, it is mentioned the upper bound part of Lemma 1 can be generalized to general Hermitian matrices. Here we rigorously prove that the positive semidefinite assumption can be removed and both inequalities in Lemma 1 still hold.

Lemma 4 (Generalized Ruhe’s trace inequality). If $A, B$ are $n \times n$ Hermitian matrices, then

\[\sum_{i=1}^n \lambda_i(A) \lambda_{n-i+1}(B) \leq \mathrm{tr}(AB) \leq \sum_{i=1}^n \lambda_i(A) \lambda_i(B).\]

Proof. Since Hermitian matrices have unitary eigenvectors, we denote the eigen-decomposition of $A$ as $A = Q \Lambda Q^\star$ where $Q$ is a unitary matrix. The trace of $AB$ is then, \(\begin{equation} \mathrm{tr}(AB) = \mathrm{tr}(Q \Lambda Q^\star B) = \mathrm{tr}(\Lambda Q^\star BQ), \end{equation}\) where $Q^\star BQ$ and $B$ share the same eigenvalues.

Therefore, without loss of generality, we can assume that $A$ is a diagonal matrix with diagonal entries being eigenvalues, $\lambda_1(A), \dots, \lambda_n(A)$. Then we have,

\[\begin{equation} \label{eq:trace} \mathrm{tr}(AB) = \sum_{i=1}^n \lambda_i(A) B_{ii}. \end{equation}\]

We use induction over $n$ to prove the lemma. For $n=1$, both inequalities become equality and hold. Now we assume that both inequalities are true for all $(n-1) \times (n-1)$ matrices.

We reorganize the trace \eqref{eq:trace} as follows,

\[\begin{split} \mathrm{tr}(AB) = & \sum_{i=1}^{n-1} (\lambda_i(A) - \lambda_n(A)) B_{ii} + \sum_{i=1}^n \lambda_n(A) B_{ii}\\ = & \mathrm{tr}(\tilde{\Lambda} \tilde{B}) + \lambda_n(A) \mathrm{tr}(B)\\ \end{split} ,\]

where $\tilde{\Lambda}$ is an $(n-1)\times (n-1)$ diagonal matrix with diagonal entries being $\lambda_i(A) - \lambda_n(A)$, and $\tilde{B}$ is the first $(n-1) \times (n-1)$ principal submatrix of $B$.

By Cauchy Interlacing Theorem, we have the following relations for eigenvalues of $B$ and $\tilde{B}$,

\[\begin{equation} \label{eq:interlacing} \lambda_{i+1}(B) \leq \lambda_i(\tilde{B}) \leq \lambda_i(B). \end{equation}\]

Applying the induction hypothesis on $\mathrm{tr}(\tilde{\Lambda} \tilde{B})$, we first show the upper bound part,

\[\begin{split} \mathrm{tr}(AB) \leq & \sum_{i=1}^{n-1} (\lambda_i(A) - \lambda_n(A)) \lambda_i(\tilde{B}) + \lambda_n(A) \sum_{i=1}^n \lambda_i(B)\\ \leq & \sum_{i=1}^{n-1} (\lambda_i(A) - \lambda_n(A)) \lambda_i(B) + \lambda_n(A) \sum_{i=1}^n \lambda_i(B)\\ = & \sum_{i=1}^n \lambda_i(A) \lambda_i(B) \end{split} ,\]

where the second inequality adopts positivity of $\lambda_i(A) - \lambda_n(A)$ and the interlacing theorem \eqref{eq:interlacing}.

Similarly, we can show the lower bound part as,

\[\begin{split} \mathrm{tr}(AB) \geq & \sum_{i=1}^{n-1} (\lambda_i(A) - \lambda_n(A)) \lambda_{n-i}(\tilde{B}) + \lambda_n(A) \sum_{i=1}^{n} \lambda_{n-i+1}(B)\\ \geq & \sum_{i=1}^{n-1} (\lambda_i(A) - \lambda_n(A)) \lambda_{n-i+1}(B) + \lambda_n(A) \sum_{i=1}^{n} \lambda_{n-i+1}(B)\\ = & \sum_{i=1}^n \lambda_i(A) \lambda_{n-i+1}(B) \end{split} .\]

Hence we proved the lemma. \(\tag*{$\blacksquare$}\)

The conditions for the equalities in Lemma 4 are very restrictive. First, $A$ and $B$ must commute so that they have the same eigenvectors and the trace of product becomes the sum of products of eigenvalues. In addition to the commutative property, the ordering of eigenvalues also matters. When the eigenvalues of $A$ are decreasingly ordered, the corresponding eigenvalues of $B$ must be also of decreasing order to satisfy the equality of the upper bound part, and of increasing order to satisfy the equality of the lower bound part.

 

Yingzhou Li

 


Reference

  1. Von Neumann, J., Some matrix-inequalities and metrization of matrix-space, Tomsk Univ. Rev. 1, 286-300 (1937). Reprinted in Collected Works (Pergamon Press, 1962), iv, 205-219. 

  2. Ruhe, A., Perturbation bounds for means of eigenvalues and invariant subspaces, Nordisk Tidskr. Informationsbehandling (BIT) 10, 343-354 (1970). 

  3. Marshall, A. W.; Olkin, I.; Arnold, B., Inequalities: Theory of Majorization and Its Applications (2nd ed.). New York: Springer. p. 340-341 (2011).  2

Discussion

I'm a faculty at Fudan University. Follow me on Github and Bitbucket. Hopefully, you will find my code useful. You are also welcome to either email me or post comments below to discuss on these posts.