This page collects important ongoing areas of research. Most of these issues have previously been discussed at a community meeting or workshop. Of course, this list is necessarily simplified and incomplete. If you feel that a central problem is missing on this list and is addressed by your published research, please contact us.
More references can be found on the literature page.
There are some principal questions that apply more or less directly to all numerical problems. Chief among them is a philosophical question of what, precisely, it means to assign a probability distribution over the result of a computation. See Mike Osborne’s recent post on this issue, which summarizes the discussion at the recent roundtable.
Another ongoing issue is to identify and champion good areas of application. Probabilistic numerical methods, in principle, allow the quantification, propagation, and control of computational precision in large computational pipelines. Development of a good code-base for this kind of functionality will proceed as the following fundamental methodological questions are slowly solved in the individual problem areas.
Each type of numerical problem poses special challenges that have to be addressed by tailored solutions. Often, intended application areas are sources of specific computational challenges.
Quadrature is the problem of inferring the integral over a function . Typically, is only accessible in form of a function handle. Probabilistic quadrature rules usually model using a Gaussian process measure. An important early work on Bayesian Quadrature is (missing reference)
Important research questions in Bayesian Quadrature currently include
how to achieve sufficiently low computational cost to be wall-clock time competitive with MCMC methods.
how to efficiently adapt model parameters to achieve well-calibrated uncertainty. Ideally, this should involve optimization across covariance function families.
how to build efficient quadrature methods for specific applications, most importantly marginalization in probabilistic models. See (Osborne et al., 2012) for recent work.
Linear algebra methods solve matrix computations. Of particular interest in this area is the solution of linear problems of the type , where is a matrix that can not be directly represented in memory, is a known vector, and the solution needs to be found. (Hennig, 2015) recently showed that the linear method of conjugate gradients can be interpreted as Gaussian regression under specific priors and likelihoods (the same holds for the BFGS method, which is equivalent to CG in the linear case). Important next steps include
good estimators for hyperparameters that give well-calibrated posteriors (meaningful error bars on matrix inverses).
expanding the probabilistic interpretation to a larger class of linear algorithms.
Optimization aims to find extremal values of a function . At the moment, research on probabilistic numerical optimization methods focusses on the optimization of twice-continuously differentiable functions , where is typically a large number. (Hennig & Kiefel, 2012) showed that the Dennis family of Quasi-Newton methods (which includes BFGS and DFP) can be interpreted as a specific kind of Gaussian regression. Ongoing work includes
finding good probabilistic optimization methods for optimization of noisy objectives (this requires solving a number of challenging sub-problems), as they arise, for example, in the training of big-data machine learning algorithms.
expanding the probabilistic interpretation to a larger class of optimization algorithms.
Ordinary differential equations are equations of the form . Classic ODE setups require the specification of explicit initial or boundary value conditions, i.e. knowledge of the form or similar. One interesting long-term question for probabilistic solvers for ODEs is whether they can deal with problems in which boundary values are only known up to uncertainty. An early work on probabilistic methods for the solution of ODEs was offered by Skilling in 1991 (Skilling, 1991), using a Gaussian process extrapolation formulation. Recently, (Chkrebtii et al., 2013) provided a more rigorous theoretical analysis, while (Hennig & Hauberg, 2014) showed the utility of such methods for the propagation and control of uncertainty. Even more recently, (missing reference) showed a connection between Gaussian extrapolators and classic Runge-Kutta methods. Many open questions remain. Among them are
theoretical analysis, and better models, for the calibration of uncertainty in ODE solvers. Calibration is particularly hard in the ODE domain, because ODE solvers recursively construct new observations from their current belief. This “boot-strapping” leads to a systematic error in error estimation. A corresponding problem exists in classic methods (known their as the mis-match between local and global error estimation).
better understanding for the algebraic structure and constraints of GP ODE solvers. A very deep literature exists for classic solvers such as the Runge-Kutta family. Currently, the probabilistic view is a lot less solid.
Partial differential equations are perhaps a particularly interesting area for the foundations of probabilistic numerics, because they are home to some of the oldest mathematical results on connections between probability distributions and the result of deterministic mathematical problems. In a famous paper, (Kac, 1949) showed that the solution to certain types of PDEs is exactly equal to the expected path of certain stochastic (Wiener) processes. This Feynman-Kac theorem is perhaps the first explicit note of a precise connection between a stochastic object and a deterministic computation. Recently, (Bui-Thanh & Girolami, 2014) used the Feynman-Kac formulation to construct a fast solver for PDEs based on a Markov-Chain Monte Carlo scheme.
This paper proposes a probabilistic framework for algorithms that iteratively solve unconstrained linear problems Bx = b with positive definite B for x. The goal is to replace the point estimates returned by existing methods with a Gaussian posterior belief over the elements of the inverse of B, which can be used to estimate errors. Recent probabilistic interpretations of the secant family of quasi-Newton optimization algorithms are extended. Combined with properties of the conjugate gradient algorithm, this leads to uncertainty-calibrated methods with very limited cost overhead over conjugate gradients, a self-contained novel interpretation of the quasi-Newton and conjugate gradient algorithms, and a foundation for new nonlinear optimization methods.
@article{2014arXiv14022058H, author = {{Hennig}, P.}, journal = {SIAM J on Optimization}, month = jan, title = {{Probabilistic Interpretation of Linear Solvers}}, year = {2015}, link = {http://epubs.siam.org/doi/abs/10.1137/140955501?journalCode=sjope8}, volume = {25}, issue = {1}, file = {http://probabilistic-numerics.org/assets/pdf/HennigLinear2015.pdf} }
We study a probabilistic numerical method for the solution of both boundary and initial value problems that returns a joint Gaussian process posterior over the solution. Such methods have concrete value in the statistics on Riemannian manifolds, where non-analytic ordinary differential equations are involved in virtually all computations. The probabilistic formulation permits marginalising the uncertainty of the numerical solution such that statistics are less sensitive to inaccuracies. This leads to new Riemannian algorithms for mean value computations and principal geodesic analysis. Marginalisation also means results can be less precise than point estimates, enabling a noticeable speed-up over the state of the art. Our approach is an argument for a wider point that uncertainty caused by numerical calculations should be tracked throughout the pipeline of machine learning algorithms.
@inproceedings{HennigAISTATS2014, author = {Hennig, Philipp and Hauberg, S{\o}ren}, booktitle = {{Proc. of the 17th int. Conf. on Artificial Intelligence and Statistics ({AISTATS})}}, publisher = {JMLR, W\&CP}, title = {{Probabilistic Solutions to Differential Equations and their Application to Riemannian Statistics}}, volume = {33}, year = {2014}, file = {http://jmlr.org/proceedings/papers/v33/hennig14.pdf}, video = {https://www.youtube.com/watch?v=fLCS0KXmiXs}, code = {http://www.probabilistic-numerics.org/GP_ODE_Solver.zip}, supplements = {http://www.probabilistic-numerics.org/22-supp.zip} }
We consider the Riemann manifold Hamiltonian Monte Carlo (RMHMC) method for solving statistical inverse problems governed by partial differential equations (PDEs). The power of the RMHMC method is that it exploits the geometric structure induced by the PDE constraints of the underlying inverse problem. Consequently, each RMHMC posterior sample is almost independent from the others providing statistically efficient Markov chain simulation. We reduce the cost of forming the Fisher information matrix by using a low rank approximation via a randomized singular value decomposition technique. This is efficient since a small number of Hessian-vector products are required. The Hessian-vector product in turn requires only two extra PDE solves using the adjoint technique. The results suggest RMHMC as a highly efficient simulation scheme for sampling from PDE induced posterior measures.
@article{2014arXiv14071517B, author = {{Bui-Thanh}, T. and {Girolami}, M.}, title = {{Solving Large-Scale PDE-constrained Bayesian Inverse Problems with Riemann Manifold Hamiltonian Monte Carlo}}, journal = {arXiv pre-print 1407.1517}, keywords = {Mathematics - Statistics Theory}, year = {2014}, month = jul, link = {http://arxiv.org/abs/1407.1517} }
This paper advocates expansion of the role of Bayesian statistical inference when formally quantifying uncertainty in computer models defined by systems of ordinary or partial differential equations. We adopt the perspective that implicitly defined infinite dimensional functions representing model states are objects to be inferred probabilistically. We develop a general methodology for the probabilistic integration of differential equations via model based updating of a joint prior measure on the space of functions and their temporal and spatial derivatives. This results in a posterior measure over functions reflecting how well they satisfy the system of differential equations and corresponding initial and boundary values. We show how this posterior measure can be naturally incorporated within the Kennedy and O’Hagan framework for uncertainty quantification and provides a fully Bayesian approach to model calibration. By taking this probabilistic viewpoint, the full force of Bayesian inference can be exploited when seeking to coherently quantify and propagate epistemic uncertainty in computer models of complex natural and physical systems. A broad variety of examples are provided to illustrate the potential of this framework for characterising discretization uncertainty, including initial value, delay, and boundary value differential equations, as well as partial differential equations. We also demonstrate our methodology on a large scale system, by modeling discretization uncertainty in the solution of the Navier-Stokes equations of fluid flow, reduced to over 16,000 coupled and stiff ordinary differential equations. Finally, we discuss the wide range of open research themes that follow from the work presented.
@article{13_bayes_uncer_quant_differ_equat, author = {Chkrebtii, O. and Campbell, D.A. and Girolami, M.A. and Calderhead, B.}, journal = {Bayesin Analysis (discussion paper)}, title = {{B}ayesian Uncertainty Quantification for Differential Equations}, year = {2013}, pages = {in press}, link = {http://arxiv.org/abs/1306.2365} }
Numerical integration is a key component of many problems in scientific computing, statistical modelling, and machine learning. Bayesian Quadrature is a model-based method for numerical integration which, relative to standard Monte Carlo methods, offers increased sample efficiency and a more robust estimate of the uncertainty in the estimated integral. We propose a novel Bayesian Quadrature approach for numerical integration when the integrand is non-negative, such as the case of computing the marginal likelihood, predictive distribution, or normalising constant of a probabilistic model. Our approach approximately marginalises the quadrature model’s hyperparameters in closed form, and introduces an ac- tive learning scheme to optimally select function evaluations, as opposed to using Monte Carlo samples. We demonstrate our method on both a number of synthetic benchmarks and a real scientific problem from astronomy.
@inproceedings{osborne2012active, author = {Osborne, M.A. and Duvenaud, D.K. and Garnett, R. and Rasmussen, C.E. and Roberts, S.J. and Ghahramani, Z.}, booktitle = {{Advances in Neural Information Processing Systems (NIPS)}}, pages = {46--54}, title = {{Active Learning of Model Evidence Using Bayesian Quadrature.}}, year = {2012}, file = {../assets/pdf/Osborne2012active.pdf } }
Four decades after their invention, quasi- Newton methods are still state of the art in unconstrained numerical optimization. Al- though not usually interpreted thus, these are learning algorithms that fit a local quadratic approximation to the objective function. We show that many, including the most popular, quasi-Newton methods can be interpreted as approximations of Bayesian linear regression under varying prior assumptions. This new notion elucidates some shortcomings of clas- sical algorithms, and lights the way to a novel nonparametric quasi-Newton method, which is able to make more efficient use of available information at computational cost similar to its predecessors.
@inproceedings{HennigKiefel, author = {Hennig, P. and Kiefel, M.}, booktitle = {{International Conference on Machine Learning (ICML)}}, title = {{Quasi-{N}ewton methods -- a new direction}}, year = {2012}, video = {http://techtalks.tv/talks/quasi-newton-methods-a-new-direction/57289/}, file = {../assets/pdf/hennig13quasiNewton.pdf} }
In the numerical solution of ordinary differential equations, a function y(x) is to be reconstructed from knowledge of the functional form of its derivative: dy/dx=f(x,y), together with an appropriate boundary condition. The derivative f is evaluated at a sequence of suitably chosen points (x_k,y_k), from which the form of y(.) is estimated. This is an inference problem, which can and perhaps should be treated by Bayesian techniques. As always, the inference appears as a probability distribution prob(y(.)), from which random samples show the probabilistic reliability of the results. Examples are given.
@article{skilling1991bayesian, author = {Skilling, J.}, journal = {Maximum Entropy and Bayesian Methods, Seattle}, title = {{Bayesian solution of ordinary differential equations}}, year = {1991} }
@article{kac1949distributions, title = {On distributions of certain Wiener functionals}, author = {Kac, M.}, journal = {Transactions of the AMS}, volume = {65}, number = {1}, pages = {1--13}, year = {1949}, link = {http://www.jstor.org/stable/1990512} }