Stopping Criterion¶

Nested sampling has no well-defined stopping point. As iterations continue, the active points sample a smaller and smaller region of prior space. This can continue indefinitely (up to the limits of your machine). Unlike typical MCMC methods, we don’t gain any additional precision on the results by letting the algorithm run longer; the precision is determined at the outset by the number of active points. So, we want to stop iterations as soon as we think the active points are doing a “pretty good job” sampling the remaining prior volume – once we’ve converged to the highest-likelihood regions such that the likelihood is relatively flat within the remaining prior volume.

For an arbitrary likelihood surface, there’s no way to ever guarantee that we’ve found the highest likelihood region. There could always be some vanishingly small volume with such high likelihood that it dominates the evidence integral (but this is a generic problem with sampling methods). In practice, there are a couple heuristic stopping criteria that work pretty well for determining when we’ve found the highest-likelihood regions.

Estimated remaining evidence¶

In this criteria, we estimate the maximum remaining evidence and stop when the contribution of the remaining evidence to the total evidence falls below some threshold. The remaining evidence is difficult to estimate (this is what we are trying to calculate in the algorithm after all!), but the current active points give some idea of how much evidence could remain. More specifically, the current active point with the highest likelihood gives an estimate of the higest likelihood in the region. We estimate the remaining evidence at iteration i according to

\[\mathcal{Z}_\mathrm{est} = \mathcal{L}_\mathrm{max} X_i\]

where \(X_i\) is the remaining prior volume. The stopping criteria is then based on the ratio between the estimated total evidence and the current evidence:

\[\log (\mathcal{Z}_i + \mathcal{Z}_\mathrm{est}) - \log \mathcal{Z}_i < \mathrm{thresh}\]

where thresh is a user-defined parameter that defaults to 0.5. This is the dlogz option in nestle.sample.

One can show that the expectation value of \(\mathcal{L}_\mathrm{max}\) is \(\mathcal{L}_{i + N log N}\), where N is the number of active points. This means that the highest-likelihood active point will be equivalent to a point saved \(N log N\) iterations later, after the remaining volume has shrunk by a factor of N.

Declining weight of saved samples¶

This method is based on looking for a flattening of the likelihood region being traversed. While the likelihood of saved samples will always increase, the increase will become smaller and smaller as the likelihood flattens. We look for a critical point where the likelihood increase becomes smaller than the decrease in the prior volume.

\[\begin{split}\mathcal{L}_{i+1} / \mathcal{L}_i &< X_i / X_{i+1} \\ &< e^{1/N} \\ \log \mathcal{L}_{i+1} - \log \mathcal{L}_i &< 1/N\end{split}\]

This is equivalent to a declining sample weight (likelihood times sample prior volume). There is significant sampling noise in the likelihood values, so we cannot stop immediately once this condition is fulfilled. Instead, we look stop when this condition is fulfilled for a number of repeated samples. In nestle.sample, this is set by the decline_factor parameter, which instructs the algorithm to stop after decline_factor * nsamples consecutive samples fulfilling the condition. This stopping criteria is inactive by default.