In stochastic variational inference, the variational Bayes objective function is optimized using stochastic gradient approximation, where gradients computed on small random subsets of data are used to approximate the true gradient over the whole data set. This enables complex models to be fit to large data sets as data can be processed in mini-batches. In this article, we extend stochastic variational inference for conjugate-exponential models to nonconjugate models and present a stochastic nonconjugate variational message passing algorithm for fitting generalized linear mixed models that is scalable to large data sets. In addition, we show that diagnostics for prior-likelihood conflict, which are useful for Bayesian model criticism, can be obtained from nonconjugate variational message passing automatically, as an alternative to simulation-based Markov chain Monte Carlo methods. Finally, we demonstrate that for moderate-sized data sets, convergence can be accelerated by using the stochastic version of nonconjugate variational message passing in the initial stage of optimization before switching to the standard version.