Abstract:
A Markov chain is geometrically ergodic if it converges to its in- variant distribution at a geometric rate in total variation norm. We study geo- metric ergodicity of deterministic and random scan versions of the two-variable Gibbs sampler. We give a sufficient condition which simultaneously guarantees both versions are geometrically ergodic. We also develop a method for simul- taneously establishing that both versions are subgeometrically ergodic. These general results allow us to characterize the convergence rate of two-variable Gibbs samplers in a particular family of discrete bivariate distributions.

Abstract:
In any Markov chain Monte Carlo analysis, rapid convergence of the chain to its target probability distribution is of practical and theoretical importance. A chain that converges at a geometric rate is geometrically ergodic. In this paper, we explore geometric ergodicity for two-component Gibbs samplers which, under a chosen scanning strategy, evolve by combining one-at-a-time updates of the two components. We compare convergence behaviors between and within three such strategies: composition, random sequence scan, and random scan. Our main results are twofold. First, we establish that if the Gibbs sampler is geometrically ergodic under any one of these strategies, so too are the others. Further, we establish a simple and verifiable set of sufficient conditions for the geometric ergodicity of the Gibbs samplers. Our results are illustrated using two examples.

Abstract:
We consider various versions of adaptive Gibbs and Metropolis within-Gibbs samplers, which update their selection probabilities (and perhaps also their proposal distributions) on the fly during a run, by learning as they go in an attempt to optimise the algorithm. We present a cautionary example of how even a simple-seeming adaptive Gibbs sampler may fail to converge. We then present various positive results guaranteeing convergence of adaptive Gibbs samplers under certain conditions.

Abstract:
We analyze the problem of preparing quantum Gibbs states of lattice spin Hamiltonians with local and commuting terms on a quantum computer and in nature. Our central result is an equivalence between the behavior of correlations in the Gibbs state and the mixing time of the semigroup which drives the system to thermal equilibrium (the Gibbs sampler). We introduce a framework for analyzing the correlation and mixing characteristics of quantum Gibbs states and quantum Gibbs samplers, which is rooted in the theory of non-commutative Lp spaces. We consider two distinct classes of Gibbs samplers, one of which being the well-studied Davies generators modelling the dynamics on the system due to weak-coupling with a large Markovian environment. We show that their gap is independent of system size if, and only if, a certain strong form of clustering of correlations holds in the Gibbs state. As concrete applications of our formalism, we show that for every one-dimensional lattice system, or for systems in lattices of any dimension at high enough temperatures, the Gibbs samplers of commuting Hamiltonians are always gapped, giving an efficient way of preparing these states on a quantum computer.

Abstract:
We consider various versions of adaptive Gibbs and Metropolis-within-Gibbs samplers, which update their selection probabilities (and perhaps also their proposal distributions) on the fly during a run by learning as they go in an attempt to optimize the algorithm. We present a cautionary example of how even a simple-seeming adaptive Gibbs sampler may fail to converge. We then present various positive results guaranteeing convergence of adaptive Gibbs samplers under certain conditions.

Abstract:
We establish some results for the rate of convergence in total variation of a Gibbs sampler to its equilibrium distribution. This sampler is motivated by a hierarchical Bayesian inference construction for a gamma random variable. Our results apply to a wide range of parameter values in the case that the hierarchical depth is 3 or 4, and are more restrictive for depth greater than 4. Our method involves showing a relationship between the total variation of two ordered copies of our chain and the maximum of the ratios of their respective co-ordinates. We construct auxiliary stochastic processes to show that this ratio does converge to 1 at a geometric rate.

Abstract:
We establish tight results for rapid mixing of Gibbs samplers for the Ferromagnetic Ising model on general graphs. We show that if \[(d-1)\tanh\beta<1,\] then there exists a constant C such that the discrete time mixing time of Gibbs samplers for the ferromagnetic Ising model on any graph of n vertices and maximal degree d, where all interactions are bounded by $\beta$, and arbitrary external fields are bounded by $Cn\log n$. Moreover, the spectral gap is uniformly bounded away from 0 for all such graphs, as well as for infinite graphs of maximal degree d. We further show that when $d\tanh\beta<1$, with high probability over the Erdos-Renyi random graph $G(n,d/n)$, it holds that the mixing time of Gibbs samplers is \[n^{1+\Theta({1}/{\log\log n})}.\] Both results are tight, as it is known that the mixing time for random regular and Erdos-Renyi random graphs is, with high probability, exponential in n when $(d-1)\tanh\beta>1$, and $d\tanh\beta>1$, respectively. To our knowledge our results give the first tight sufficient conditions for rapid mixing of spin systems on general graphs. Moreover, our results are the first rigorous results establishing exact thresholds for dynamics on random graphs in terms of spatial thresholds on trees.

Abstract:
We consider Gibbs and block Gibbs samplers for a Bayesian hierarchical version of the one-way random effects model. Drift and minorization conditions are established for the underlying Markov chains. The drift and minorization are used in conjunction with results from J. S. Rosenthal [J. Amer. Statist. Assoc. 90 (1995) 558-566] and G. O. Roberts and R. L. Tweedie [Stochastic Process. Appl. 80 (1999) 211-229] to construct analytical upper bounds on the distance to stationarity. These lead to upper bounds on the amount of burn-in that is required to get the chain within a prespecified (total variation) distance of the stationary distribution. The results are illustrated with a numerical example.

Abstract:
The Partially Collapsed Gibbs (PCG) sampler offers a new strategy for improving the convergence of a Gibbs sampler. PCG achieves faster convergence by reducing the conditioning in some of the draws of its parent Gibbs sampler. Although this can significantly improve convergence, care must be taken to ensure that the stationary distribution is preserved. The conditional distributions sampled in a PCG sampler may be incompatible and permuting their order may upset the stationary distribution of the chain. Extra care must be taken when Metropolis-Hastings (MH) updates are used in some or all of the updates. Reducing the conditioning in an MH within Gibbs sampler can change the stationary distribution, even when the PCG sampler would work perfectly if MH were not used. In fact, a number of samplers of this sort that have been advocated in the literature do not actually have the target stationary distributions. In this article, we illustrate the challenges that may arise when using MH within a PCG sampler and develop a general strategy for using such updates while maintaining the desired stationary distribution. Theoretical arguments provide guidance when choosing between different MH within PCG sampling schemes. Finally we illustrate the MH within PCG sampler and its computational advantage using several examples from our applied work.