You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Stochastic trace estimation for parameter-dependent matrices applied to spectral density approximation
Abstract
Stochastic trace estimation is a well-established tool for approximating the trace of a large symmetric positive semi-definite matrix $\boldsymbol{B}$. Several applications involve a matrix that depends continuously on a parameter $t \in [a,b]$, and require trace estimates of $\boldsymbol{B}(t)$ for many different values of $t$. This is, for example, the case when approximating the spectral density of a matrix or computing partition functions of quantum systems. Approximating the trace separately for each matrix $\boldsymbol{B}(t_1), \dots, \boldsymbol{B}(t_m)$ clearly incurs redundancies and a cost that scales linearly with $m$. To address this issue, we propose and analyze modifications for three stochastic trace estimators, the Girard-Hutchinson, Nyström, and Nyström++ estimators. Our modification uses \emph{constant} randomization across different values of $t$, that is, every matrix $\boldsymbol{B}(t_1), \dots, \boldsymbol{B}(t_m)$ is multiplied with the \emph{same} set of random vectors. When combined with Chebyshev approximation in $t$, the use of such constant random matrices allows one to reuse matrix-vector products across different values of $t$, leading to significant cost reduction. Our analysis shows that the loss of stochastic independence across different $t$ does not lead to deterioration; in fact, our error bounds for these estimators closely match existing error bounds for the constant case. In particular, we show that $\mathcal{O}(\varepsilon^{-1})$ random matrix-vector products suffice to ensure an error of $\varepsilon > 0$ for Nyström++, independent of low-rank properties of $\boldsymbol{B}(t)$. We discuss in detail how the combination of Nyström++ with Chebyshev approximation applies to large-scale spectral density estimation and provide an analysis of the resulting method. This improves various aspects of an existing stochastic estimator for spectral density estimation. Several numerical experiments for examples from electronic structure interaction, statistical thermodynamics, and neural network optimization illustrate and validate our findings.
Reproducing the whole project might around one hour!
Tests
To run the tests written for the algorithms developed and used in the paper, you will need to install pytest and run the command pytest at the root of this project with the commands
python -m pip install pytest
pytest
Paper overview
We consider parameter-dependent matrices of the form
where $b_{ij}(t)$ are functions depending continuously on the parameter $t$ which takes values in the interval $[a,b]$. The trace of such a matrix is defined as
However, we assume that we only have access to products of this matrix with vectors for each $t \in [a, b]$, so this definition will not be directly useful for computing the trace.
Girard-Hutchinson estimator
We can approximate the trace with the Girard-Hutchinson estimator: We take $n_{\boldsymbol{\Psi}}$ stochastically independent Gaussian random vectors $\boldsymbol{\psi}_1,\dots, \boldsymbol{\psi}_{n_{\boldsymbol{\Psi}}} \in \mathbb{R}^{n}$ to form
Alternatively, the trace of a symmetric matrix whose singular values decay quickly can be approximated well by using a Gaussian sketching matrix $\boldsymbol{\Omega} \in \mathbb{R}^{n \times n_{\boldsymbol{\Omega}}}$ to form the Nyström approximation
Then we can estimate the trace as $\mathrm{Tr}(\boldsymbol{B}_{\boldsymbol{\Omega}}(t))$. Thanks to the invariance of the trace under cyclic permutation of its arguments and the symmetry of the matrix, we may rewrite this estimator as
Finally, an estimator which corrects for inaccuracies in the Nyström approximation by estimating the trace of its residual using the Girard-Hutchinson estimator is
This is the parameter-dependent analogue of the Nyström++ estimator, which is based on the Hutch++ estimator.
Chebyshev-Nyström++ method
The smoothed spectral density of a real symmetric matrix $\boldsymbol{A} \in \mathbb{R}^{n \times n}$ with eigenvalues $\lambda_1, \dots, \lambda_n \in \mathbb{R}$ is defined as
which allows us to apply any of the above presented stochastic trace estimators to the parameter dependent matrix $g_{\sigma}(t \boldsymbol{I}_n - \boldsymbol{A})$. To evaluate $g_{\sigma}(t \boldsymbol{I}_n - \boldsymbol{A})$, we approximate $g_{\sigma}$ with its degree $m$ Chebyshev interpolant
whose coefficients $\mu_{l}(t)$ we can compute through a discrete cosine transform.
Finally the Chebyshev-Nyström++ estimator approximates the smoothed spectral density by applying the Nyström++ approximation to the Chebyshev interpolant
parameter-trace
│ README.md (file you are reading right now)
| requirements.txt (Python package requirements file)
| reproduce.py (script for easy reproduction of project)
|
└───paper (the LaTeX project for the paper)
└───reproduce (scripts which help setup and reproduce project)
└───algorithms (the algorithms introduced in the paper)
└───matrices (the example matrices used for the numerical results)
└───test (unit tests written for the algorithms)