For one of our projects, we needed to fit a statistical model with 50 parameters using >10,000 data points. Because many of the parameters were highly correlated in the posterior, MCMC sampling was slow. While waiting for our Markov chains to converge, we started exploring strategies for accelerating MCMC sampling and using approximate inference to explore challenging posteriors. Pathfinder.jl was born!
Inference is slow
In probabilistic programming, we construct a posterior distribution represented by a log-density function. The posterior is then typically analyzed either by drawing samples from it using Markov chain Monte Carlo (MCMC) or by fitting a simpler distribution to it with variational inference and studying that variational approximation.
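For concreteness, here is a minimal sketch of such a log-density target using the LogDensityProblems.jl interface; the isotropic Gaussian is a toy stand-in for a real posterior, and the type name is our own:

```julia
using LogDensityProblems

# Toy stand-in for a real posterior: a 5-dimensional isotropic Gaussian.
struct ToyPosterior end

LogDensityProblems.logdensity(::ToyPosterior, x) = -sum(abs2, x) / 2
LogDensityProblems.dimension(::ToyPosterior) = 5
function LogDensityProblems.capabilities(::Type{ToyPosterior})
    # We only provide the log density itself; gradients are left to AD.
    return LogDensityProblems.LogDensityOrder{0}()
end
```

MCMC samplers and variational methods alike can then consume objects like this.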
Probabilistic models often have computational problems or make poor assumptions, which can manifest as poor MCMC sampling performance. By analyzing the MCMC draws, one may be able to identify the problems with the model and fix them. However, problematic models are precisely the models that tend to be slow to sample with MCMC, so much time is lost waiting for the MCMC warm-up phase to end before those draws are available.
The folk theorem of statistical computing:
When you have computational problems, often there's a problem with your model.
Pathfinder.jl to the rescue!
Pathfinder is a variational method for approximating a posterior distribution that is often much faster than MCMC warm-up. It can be used to get initial draws for diagnosing problems with a model, to find a point from which to initialize MCMC sampling, or even to replace the warm-up phase of Hamiltonian Monte Carlo (HMC).
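In code, that workflow looks roughly like this (reusing the toy target above; the keyword and field names follow the Pathfinder.jl documentation, but treat this as a sketch rather than a definitive recipe):

```julia
using Pathfinder

# Run single-path Pathfinder on the toy log density.
result = pathfinder(ToyPosterior(); ndraws=100)

result.fit_distribution       # the multivariate normal approximation
result.draws                  # approximate posterior draws, one per column
init_point = result.draws[:, 1]  # a plausible point for initializing MCMC
```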
We wrote Pathfinder.jl, a Julia implementation of Pathfinder that can be used with any Julia probabilistic programming language. It integrates especially well with AdvancedHMC.jl, DynamicHMC.jl, Turing.jl, and Stan.jl (via the StanLogDensityProblems.jl interface). By making extensive use of Julia interface packages, it also facilitates further research into variants of Pathfinder.
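For instance, running Pathfinder on a Turing.jl model can look roughly like the following (a toy model of our own; see the Pathfinder.jl documentation for the exact supported patterns):

```julia
using Pathfinder, Turing

# A toy coin-flip model, purely for illustration.
@model function coinflip(y)
    p ~ Beta(1, 1)
    y .~ Bernoulli(p)
end

model = coinflip(rand(Bool, 100))

# Pathfinder.jl can consume the Turing model directly.
result = pathfinder(model; ndraws=1_000)

# Draws mapped back to the model's parameter space, useful for diagnosing
# the model or for initializing an MCMC sampler such as NUTS.
result.draws_transformed
```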
Pathfinder.jl in the wild
Much has happened since we originally implemented Pathfinder.jl! BlackJAX and Stan now both include implementations of Pathfinder among their inference methods. Stan's implementation also has a Julia interface (it supports only Stan models). While these are all excellent implementations, Pathfinder.jl remains the most extensible and the most suitable for experimentation.
As an example, Tarek and Huang (2022) used Pathfinder.jl to replace the L-BFGS optimizer at the core of Pathfinder with IPOPT, which keeps a record of previous solutions. This allowed multi-path Pathfinder to perfectly fit simple Gaussian mixture models.
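In Pathfinder.jl, this kind of swap is a keyword argument away. Here is a sketch, assuming the `optimizer` keyword described in the package documentation (IPOPT itself would be passed similarly via its Optimization.jl wrapper):

```julia
using Optim, Pathfinder

# Swap the default L-BFGS for another optimizer when tracing the
# optimization trajectory; any Optimization.jl-compatible optimizer
# that supports callbacks should work.
result = pathfinder(ToyPosterior(); optimizer=Optim.ConjugateGradient())
```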
Pathfinder.jl seems to be especially useful for astrophysical applications. For example, it's been used for accelerating inference while imaging black holes, emulating the cosmic microwave background, and modeling orbits of exoplanets.
If you're interested in using Pathfinder, we've provided some examples in the package documentation to get you started. Let us know how it works for your models!
References
- Lu Zhang, Bob Carpenter, Andrew Gelman, and Aki Vehtari (2021). Pathfinder: Parallel quasi-Newton variational inference. arXiv: 2108.03782 [stat.ML].
- Pathfinder.jl: https://github.com/mlcolab/Pathfinder.jl