Probabilistic-Numerics.org

# Probabilistic Integration

### By François-Xavier Briol, 2015-12-03 10:00:00 +0100

Recent months have been very exciting for the probabilistic numerics community, with a series of new interesting papers appearing and a scoping workshop at the Alan Turing Institute discussing avenues for future research. In particular, there has been a lot of interest in solving differential equations, which was the topic of two workshops at the University of Warwick and at SciCADE, but the focus is now slightly shifting towards quadrature with two workshops on Probabilistic Integration (not to be missed!) at NIPS 2015 in December and MCMSki V in January.

Probabilistic Integration, and, in particular, Bayesian Quadrature (BQ), was one of the first areas to be investigated in probabilistic numerics. The overall idea is to fully handle our epistemic uncertainty over the numerical solution of the integral; this can, of course, be obtained by adopting a Bayesian approach. More precisely, we can model the integrand using a Gaussian Process and, using the linearity of Gaussian variables, obtain a Gaussian posterior distribution over the solution of the integral. We then usually consider the posterior mean of this Gaussian distribution to be our approximation of the integral, while the posterior variance can help us understand how uncertain we are about our solution. This method has been extended numerous times in order to improve its applicability to statistical tasks, such as computing ratios of integrals, or considering cases where the integrand is a probability distribution and hence intrinsically non-negative.

Recent work has focussed on studying the links between BQ and other methods. For example, (Särkkä, Hartikainen, Svensson, & Sandblom, 2015), discussed how many popular quadrature methods can be obtained as the posterior mean of a BQ method with a specific kernel. Similarly, (Bach, 2015) showed that performing BQ is equivalent to using a specific type of random features.

On the other hand, existing methods can, and sometimes should, also help to design more efficient BQ methods. In a recent paper by (Briol, Oates, Girolami, & Osborne, 2015), the authors showed how a convex optimisation method, called the Frank-Wolfe algorithm, could provide a probabilistic solution to integration problems. This method also provided the first ever provable posterior convergence and contraction rates for BQ. Follow-up work (Briol, Oates, Girolami, Osborne, & Sejdinovic, 2015) also demonstrated that similar theory could be obtained when using Monte Carlo methods (such as Markov Chain or Quasi Monte Carlo) to pick the points at which the integrand is evaluated. This paper also provides rates for the newly proposed methods, as well as a framework for obtaining rates of any new future BQ methods.

Clearly, much remains to be done to be able to provide similar theoretical guarantees to alternative methods, but these early successes are very promising and the increasing popularity of workshops in probabilistic numerics clearly demonstrates the interest from potential users!

1. Briol, F.-X., Oates, C. J., Girolami, M., Osborne, M. A., & Sejdinovic, D. (2015). Probabilistic Integration: A Role for Statisticians in Numerical Analysis? ArXiv:1512.00933 [Cs, Math, Stat]. Retrieved from http://arxiv.org/abs/1512.00933

A research frontier has emerged in scientific computation, founded on the principle that numerical error entails epistemic uncertainty that ought to be subjected to statistical analysis. This viewpoint raises several interesting challenges, including the design of statistical methods that enable the coherent propagation of probabilities through a (possibly deterministic) computational pipeline. This paper examines thoroughly the case for probabilistic numerical methods in statistical computation and a specific case study is presented for Markov chain and Quasi Monte Carlo methods. A probabilistic integrator is equipped with a full distribution over its output, providing a measure of epistemic uncertainty that is shown to be statistically valid at finite computational levels, as well as in asymptotic regimes. The approach is motivated by expensive integration problems, where, as in krigging, one is willing to expend, at worst, cubic computational effort in order to gain uncertainty quantification. There, probabilistic integrators enjoy the "best of both worlds", leveraging the sampling efficiency of Monte Carlo methods whilst providing a principled route to assessment of the impact of numerical error on scientific conclusions. Several substantial applications are provided for illustration and critical evaluation, including examples from statistical modelling, computer graphics and uncertainty quantification in oil reservoir modelling.

@article{briol_probabilistic_2015,
title = {Probabilistic {Integration}: A Role for Statisticians in Numerical Analysis?},
url = {http://arxiv.org/abs/1512.00933},
urldate = {2015-07-22},
journal = {arXiv:1512.00933 [cs, math, stat]},
author = {Briol, François-Xavier and Oates, Chris J. and Girolami, Mark and Osborne, Michael A. and Sejdinovic, Dino},
year = {2015},
note = {arXiv: 1512.00933},
file = {http://arxiv.org/pdf/1512.00933.pdf}
}

2. Bach, F. (2015). On the Equivalence between Quadrature Rules and Random Features. ArXiv Preprint ArXiv:1502.06800.

We show that kernel-based quadrature rules for computing in tegrals can be seen as a special case of random feature expansions for positive definite kernels, for a particular decomposition that always exists for such kernels. We provide a theoretical analysis of the number of required samples for a given approximation error, leading to both upper and lower bounds that are based solely on the eigenvalues of the associated integral operator and match up to logarithmic terms. In particular, we show that the upper bound may be obtained from independent an d identically distributed samples from a specific non-uniform distribution, while the lower bo und if valid for any set of points. Applying our results to kernel-based quadrature, while our results are fairly general, we recover known upper and lower bounds for the special cases of Sobolev spaces. Moreover, our results extend to the more general problem of full function approxim ations (beyond simply computing an integral), with results in L2- and L∞-norm that match known results for special cases. Applying our results to random features, we show an improvement of the number of random features needed to preserve the generalization guarantees for learning with Lipshitz-continuous losses.

@article{bach2015equivalence,
title = {On the Equivalence between Quadrature Rules and Random Features},
author = {Bach, Francis},
journal = {arXiv preprint arXiv:1502.06800},
file = {http://arxiv.org/pdf/1502.06800v2.pdf},
year = {2015}
}

3. Briol, F.-X., Oates, C. J., Girolami, M., & Osborne, M. A. (2015). Frank-Wolfe Bayesian Quadrature: Probabilistic Integration with Theoretical Guarantees. In Advances in Neural Information Processing Systems (NIPS). Retrieved from http://arxiv.org/abs/1506.02681

There is renewed interest in formulating integration as an inference problem, motivated by obtaining a full distribution over numerical error that can be propagated through subsequent computation. Current methods, such as Bayesian Quadrature, demonstrate impressive empirical performance but lack theoretical analysis. An important challenge is to reconcile these probabilistic integrators with rigorous convergence guarantees. In this paper, we present the first probabilistic integrator that admits such theoretical treatment, called Frank-Wolfe Bayesian Quadrature (FWBQ). Under FWBQ, convergence to the true value of the integral is shown to be exponential and posterior contraction rates are proven to be superexponential. In simulations, FWBQ is competitive with state-of-the-art methods and out-performs alternatives based on Frank-Wolfe optimisation. Our approach is applied to successfully quantify numerical error in the solution to a challenging model choice problem in cellular biology.

@inproceedings{briol_frank-wolfe_2015,
title = {Frank-{Wolfe} {Bayesian} {Quadrature}: {Probabilistic} {Integration} with {Theoretical} {Guarantees}},
url = {http://arxiv.org/abs/1506.02681},
booktitle = {Advances in Neural Information Processing Systems (NIPS)},
author = {Briol, François-Xavier and Oates, Chris J. and Girolami, Mark and Osborne, Michael A.},
year = {2015},
keywords = {Statistics - Machine Learning},
file = {http://arxiv.org/pdf/1506.02681v1.pdf}
}

4. Särkkä, S., Hartikainen, J., Svensson, L., & Sandblom, F. (2015). On the relation between Gaussian process quadratures and sigma-point methods. ArXiv Preprint Stat.ME 1504.05994.

This article is concerned with Gaussian process quadratures, which are numerical integration methods based on Gaussian process regression methods, and sigma-point methods, which are used in advanced non-linear Kalman filtering and smoothing algorithms. We show that many sigma-point methods can be interpreted as Gaussian quadrature based methods with suitably selected covariance functions. We show that this interpretation also extends to more general multivariate Gauss–Hermite integration methods and related spherical cubature rules. Additionally, we discuss different criteria for selecting the sigma-point locations: exactness for multivariate polynomials up to a given order, minimum average error, and quasi-random point sets. The performance of the different methods is tested in numerical experiments.

@article{2015arXiv150405994S,
author = {{S{\"a}rkk{\"a}}, S. and {Hartikainen}, J. and {Svensson}, L. and {Sandblom}, F.},
title = {{On the relation between Gaussian process quadratures and sigma-point methods}},
journal = {arXiv preprint stat.ME 1504.05994},
year = {2015},
month = apr,
file = {http://arxiv.org/pdf/1504.05994v1.pdf}
}