Multivariate Modality Inference Using Gaussian Kernel

DOI: 10.4236/ojs.2014.45041, PP. 419-434

Keywords: Modality, Kernel Density Estimate, Mode, Clustering

Abstract:

The number of modes (also known as the modality) of a kernel density estimator (KDE) has attracted considerable interest and is important in practice. In this paper, we develop an inference framework for the modality of a KDE in the multivariate setting using a Gaussian kernel. We apply the modal clustering method proposed by [1] for mode hunting. A test statistic and its asymptotic distribution are derived to assess the significance of each mode. The inference procedure is applied to both simulated and real data sets.
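
As a rough illustration of the mode-hunting step, the sketch below climbs from each data point to a local maximum of a Gaussian KDE and groups points that reach the same maximum. It assumes an isotropic bandwidth matrix h^2 I, in which case the modal EM update reduces to a kernel-weighted mean of the data. The names `modal_em_ascent`, `find_modes`, `bandwidth`, and `merge_tol` are hypothetical, and the significance test described in the abstract is not implemented here.

```python
import numpy as np


def modal_em_ascent(x0, data, bandwidth, max_iter=500, tol=1e-8):
    """Climb from x0 toward a local mode of a Gaussian KDE.

    Assumption: every kernel shares the covariance bandwidth**2 * I.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # E-step: posterior weight of each kernel component at the current point.
        sq_dist = np.sum((data - x) ** 2, axis=1)
        log_w = -0.5 * sq_dist / bandwidth ** 2
        w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
        w /= w.sum()
        # M-step: with equal-covariance Gaussian kernels the maximizer
        # is the weighted mean of the kernel centers (the data points).
        x_new = w @ data
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x


def find_modes(data, bandwidth, merge_tol=1e-3):
    """Start an ascent at every data point and merge coincident end points."""
    data = np.asarray(data, dtype=float)
    modes = []
    labels = np.empty(len(data), dtype=int)
    for i, point in enumerate(data):
        m = modal_em_ascent(point, data, bandwidth)
        for j, existing in enumerate(modes):
            if np.linalg.norm(m - existing) < merge_tol:
                labels[i] = j  # this ascent reached a mode found earlier
                break
        else:
            modes.append(m)  # a mode not seen before
            labels[i] = len(modes) - 1
    return np.array(modes), labels
```

The number of rows in the returned array of modes is the modality of the KDE at the chosen bandwidth, and `labels` gives the corresponding modal clustering of the data. Both depend strongly on the bandwidth, which is one reason a significance assessment of each mode is needed.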

References

[1]  Li, J., Ray, S. and Lindsay, B.G. (2007) A Nonparametric Statistical Approach to Clustering via Mode Identification. Journal of Machine Learning Research, 8, 1687-1723.
[2]  Tibshirani, R., Walther, G. and Hastie, T. (2001) Estimating the Number of Clusters in a Data Set via the Gap Statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63, 411-423.
http://dx.doi.org/10.1111/1467-9868.00293
[3]  McLachlan, G. and Peel, D. (2004) Finite Mixture Models. Wiley, Hoboken.
[4]  Lloyd, S. (1982) Least Squares Quantization in PCM. IEEE Transactions on Information Theory, 28, 129-137.
http://dx.doi.org/10.1109/TIT.1982.1056489
[5]  Fraley, C. and Raftery, A.E. (2002) Model-Based Clustering, Discriminant Analysis, and Density Estimation. Journal of the American Statistical Association, 97, 611-631.
http://dx.doi.org/10.1198/016214502760047131
[6]  Silverman, B.W. (1981) Using Kernel Density Estimates to Investigate Multimodality. Journal of the Royal Statistical Society, Series B (Methodological), 43, 97-99.
[7]  Efron, B. (1979) Bootstrap Methods: Another Look at the Jackknife. The Annals of Statistics, 7, 1-26.
http://dx.doi.org/10.1214/aos/1176344552
[8]  Minnotte, M.C. (1997) Nonparametric Testing of the Existence of Modes. The Annals of Statistics, 25, 1646-1660.
http://dx.doi.org/10.1214/aos/1031594735
[9]  Burman, P. and Polonik, W. (2009) Multivariate Mode Hunting: Data Analytic Tools with Measures of Significance. Journal of Multivariate Analysis, 100, 1198-1218.
http://dx.doi.org/10.1016/j.jmva.2008.10.015
[10]  Fukunaga, K. (1990) Introduction to Statistical Pattern Recognition. Academic Press, Waltham.
[11]  Scott, D.W. (1992) Multivariate Density Estimation: Theory, Practice, and Visualization. John Wiley, New York.
[12]  Li, Q. and Racine, J.S. (2011) Nonparametric Econometrics: Theory and Practice. Princeton University Press, Princeton.
[13]  Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977) Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 39, 1-38.
[14]  Ray, S. and Lindsay, B.G. (2005) The Topography of Multivariate Normal Mixtures. The Annals of Statistics, 33, 2042-2065.
http://dx.doi.org/10.1214/009053605000000417
[15]  Dmitrienko, A., Tamhane, A.C. and Bretz, F. (2010) Multiple Testing Problems in Pharmaceutical Statistics. CRC Press, Boca Raton.
[16]  Ray, S. and Pyne, S. (2012) A Computational Framework to Emulate the Human Perspective in Flow Cytometric Data Analysis. PloS One, 7, Article ID: e35693.
http://dx.doi.org/10.1371/journal.pone.0035693
[17]  Flury, B. and Riedwyl, H. (1988) Multivariate Statistics: A Practical Approach. Chapman & Hall, Ltd., London.
http://dx.doi.org/10.1007/978-94-009-1217-5
[18]  Lindsay, B.G., Markatou, M., Ray, S., Yang, K. and Chen, S.C. (2008) Quadratic Distances on Probabilities: A Unified Foundation. The Annals of Statistics, 36, 983-1006.
http://dx.doi.org/10.1214/009053607000000956
