%0 Journal Article
%T Growth Estimators and Confidence Intervals for the Mean of Negative Binomial Random Variables with Unknown Dispersion
%A David Shilane
%A Derek Bean
%J Journal of Probability and Statistics
%D 2013
%I Hindawi Publishing Corporation
%R 10.1155/2013/602940
%X The negative binomial distribution becomes highly skewed under extreme dispersion. Even at moderately large sample sizes, the sample mean exhibits a heavy right tail. The standard normal approximation often does not provide adequate inferences about the data's expected value in this setting. In previous work, we have examined alternative methods of generating confidence intervals for the expected value. These methods were based upon Gamma and Chi Square approximations or tail probability bounds such as Bernstein's inequality. We now propose growth estimators of the negative binomial mean. Under high dispersion, zero values are likely to be overrepresented in the data. A growth estimator constructs a normal-style confidence interval by effectively removing a small, predetermined number of zeros from the data. We propose growth estimators based upon multiplicative adjustments of the sample mean and direct removal of zeros from the sample. These methods do not require estimating the nuisance dispersion parameter. We will demonstrate that the growth estimators' confidence intervals provide improved coverage over a wide range of parameter values and asymptotically converge to the sample mean. Interestingly, the proposed methods succeed despite adding both bias and variance to the normal approximation.
1. Introduction
Confidence intervals are routinely applied to limited samples of data based upon their asymptotic properties. For instance, the central limit theorem states that the sample mean will have an approximately Normal distribution for large sample sizes provided that the data’s second moment is finite. This Normal approximation is a fundamental tool for inferences about the data’s expected value in a wide variety of settings. Confidence intervals for the mean are often based upon Normal quantiles even when the sample size is only moderate (e.g., 30 or 50). However, the Normal approximation’s quality cannot be ensured for highly skewed distributions [1].
In this setting, the sample mean may converge to the Normal in distribution at a much slower rate. The Negative Binomial distribution is known to have an extremely heavy right tail, especially under high dispersion. In previous work [2], we established that the Normal confidence interval significantly undercovers the mean at moderate sample sizes. We also suggested alternatives based upon Gamma and Chi Square approximations along with tail probability bounds such as Bernstein’s inequality. We now propose growth estimators for the mean. These estimators seek to account for the relative overrepresentation
%U http://www.hindawi.com/journals/jps/2013/602940/