Backstory
“Bayesian calibration of computer models” is the title of the seminal paper by Kennedy & O’Hagan (KOH) from 2001. In the years since publication the methodology has been applied to many domains from building energy models to radiative shock hydrodynamic models. Succinctly, Bayesian calibration of computer models seeks to estimate and quantify the uncertainty of the tunable input parameters (also known as calibration parameters) to a computer simulation model by comparing the simulator’s output against real observations. KOH’s success is driven by the model’s ability to account for many sources of uncertainty inherent in this parameter estimation problem.
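For readers who prefer notation, the sources of uncertainty mentioned above are usually written as a single observation equation. The symbols below follow the common presentation of the KOH model and are introduced here only for illustration: $z_i$ is a real observation at input $x_i$, $\eta$ is the simulator, $\theta$ are the calibration parameters, $\rho$ is a scaling parameter, $\delta$ is the model discrepancy and $e_i$ is observation error.

$$
z_i = \rho\,\eta(x_i, \theta) + \delta(x_i) + e_i, \qquad e_i \sim \mathcal{N}(0, \sigma^2)
$$

In the standard formulation both $\eta$ and $\delta$ are given Gaussian process priors, which is how uncertainty about the simulator and about its structural bias enters the posterior for $\theta$.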
Of course, there are competitors to KOH calibration. Tuo & Wu (2015) calibration simply seeks to minimise the RMSE, at the expense of calibration parameter uncertainty quantification (UQ) and without considering any form of model structural bias. History matching by Bower et al. (2010) calculates a form of Mahalanobis distance to generate an implausibility score, similar to a test statistic. Calibrate, Emulate, Sample (CES) by Cleary et al. (2021) is a three-step procedure for building an emulator of the parameter-to-data map, from which the approximate posterior distribution of the calibration parameters can be sampled.
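To make the implausibility score from history matching a little more concrete, here is a minimal sketch of a Mahalanobis-style distance between observations and an emulator prediction. The function name `implausibility` and its arguments are hypothetical, chosen for illustration, and the three variance terms (emulator, observation error, model discrepancy) are assumed to be supplied by the user rather than estimated.

```python
import numpy as np


def implausibility(z, emulator_mean, emulator_cov, obs_cov, discrepancy_cov):
    """Mahalanobis-style distance between observations and an emulator prediction.

    All names here are illustrative. The covariances may be scalars
    (single output) or square matrices (multiple outputs).
    """
    z = np.atleast_1d(np.asarray(z, dtype=float))
    mean = np.atleast_1d(np.asarray(emulator_mean, dtype=float))
    diff = z - mean

    # Total uncertainty: emulator variance + observation error + discrepancy.
    total_cov = (
        np.atleast_2d(emulator_cov)
        + np.atleast_2d(obs_cov)
        + np.atleast_2d(discrepancy_cov)
    )

    # sqrt(diff^T total_cov^{-1} diff), using solve() instead of an explicit inverse.
    return float(np.sqrt(diff @ np.linalg.solve(total_cov, diff)))


# Single-output example: observation 3.0, emulator mean 1.0, total variance 1.0.
score = implausibility(3.0, 1.0, 0.5, 0.25, 0.25)
print(score)  # 2.0
```

A candidate parameter setting is typically ruled out when its score exceeds a chosen cut-off (3 is a common choice for a single output), which is where the resemblance to a test statistic comes from.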
There are also criticisms of KOH calibration. Building good priors for the model discrepancy is non-trivial, so many practitioners simply ignore this concept. However, Brynjarsdóttir & O’Hagan (2014) show that without a model discrepancy term the posterior calibration parameter estimates from KOH are biased. Yet with the model discrepancy term, the KOH model posterior is unidentifiable. Plumlee (2017) tries to fix this by forcing the model discrepancy to be orthogonal to the simulator, but this is computationally expensive. To achieve good posterior samples, the MCMC algorithm needs carefully chosen priors, which requires detailed discussions between statisticians and domain experts. Consequently, KOH calibration cannot be easily applied by practitioners, which has hindered its wider adoption in the field despite its initial publication almost 25 years ago.
I presented this topic as part of the UCL Department of Statistical Science PhD Seminar Series back in February 2024. A recording of this presentation from YouTube is embedded below.