Large deviations for Markov chain Monte Carlo methods
Seminar in Insurance and Economics
SPEAKER: Pierre Nyquist (KTH Royal Institute of Technology).
TITLE: Large deviations for Markov chain Monte Carlo methods: the surprisingly curious case of the Metropolis-Hastings algorithm.
ABSTRACT: Markov chain Monte Carlo (MCMC) methods have become the workhorse for numerical computations in a range of scientific disciplines, e.g., computational chemistry and physics, statistics, and machine learning. The performance of MCMC methods has therefore become an important topic within (applied probability) and statistics. In particular, when the underlying distribution one is trying to sample from becomes sufficiently complex, convergence speed and/or the cost per iteration becomes an issue for most MCMC methods. The analysis, and subsequently design, of MCMC methods has to a large degree relied on classical tools used to determine the speed of convergence of Markov chains, e.g., mixing times, spectral gap and functional inequalities (Poincaré, log-Sobolev). An alternative avenue is to use the theory of large deviations for empirical measures. In this talk I will first give a general outline of this approach to analysing MCMC methods, along with some recent examples. I will then consider the specific case of the Metropolis-Hastings algorithm, the most classical amongst all MCMC methods and a foundational building block for many more advanced methods. Despite the simplicity of this method, it turns out that the theoretical analysis of it is still a rich area, and from the large deviation perspective it is surprisingly difficult to treat. As a first step we show a large deviation principle for the underlying Markov chain, extending the celebrated Donsker-Varadhan theory. Time permitted I will also discuss ongoing and future work on using this result for better understanding both the Metropolis-Hastings method and more advanced methods, such as approximate Bayesian computation (ABC-MCMC) and the Metropolis-adjusted Langevin algorithm (MALA).
The talk is primarily based on join work with Federica Milinanni (KTH). It will be entirely self-contained and no previous knowledge of the theory of large deviations is necessary.