We consider the efficacy of
a proposed linear-dimension-reduction method for increasing the power
of five hypothesis tests for the difference of two high-dimensional
multivariate-normal population-mean vectors under the assumption of
homoscedastic covariance matrices. We use Monte Carlo simulations to contrast
the empirical powers of the five high-dimensional tests by using both the
original data and dimension-reduced data. From the Monte Carlo simulations, we
conclude that a test by Thulin [1], when performed on dimension-reduced
data, yields the best omnibus power for detecting a difference between two
high-dimensional population-mean vectors. We also illustrate the utility of our
dimension-reduction method using real data consisting of genetic sequences of two
groups of patients with Crohn’s disease and ulcerative colitis.
References
[1]
Thulin, M. (2014) A High-Dimensional Two-Sample Test for the Mean Using Random Subspaces. Computational Statistics and Data Analysis, 74, 26-38.
https://doi.org/10.1016/j.csda.2013.12.003
[2]
Dempster, A.P. (1958) A High Dimensional Two Sample Significance Test. Annals of Mathematical Statistics, 29, 995-1010. https://doi.org/10.1214/aoms/1177706437
[3]
Bai, Z. and Saranadasa, H. (1996) Effect of High Dimension: By an Example of a Two Sample Problem. Statistica Sinica, 6, 311-329.
[4]
Srivastava, M.S. (2007) Multivariate Theory for Analyzing High Dimensional Data. The Journal of the Japan Statistical Society, 37, 53-86.
https://doi.org/10.14490/jjss.37.53
[5]
Srivastava, M.S. and Du, M. (2008) A Test for the Mean Vector with Fewer Observations than the Dimension. Journal of Multivariate Analysis, 99, 386-402.
https://doi.org/10.1016/j.jmva.2006.11.002
[6]
Park, J. and Ayyala, D.N. (2013) A Test for the Mean Vector in Large Dimension and Small Samples. Journal of Statistical Planning and Inference, 143, 929-943.
https://doi.org/10.1016/j.jspi.2012.11.001
[7]
Chen, S.X. and Qin, Y.-L. (2010) A Two-Sample Test for High-Dimensional Data with Applications to Gene-Set Testing. Annals of Statistics, 38, 808-835.
https://doi.org/10.1214/09-AOS716
[8]
Bickel, P. and Levina, E. (2008) Regularized Estimation of Large Covariance Matrices. Annals of Statistics, 36, 199-227.
https://doi.org/10.1214/009053607000000758
[9]
Cai, T. and Liu, W. (2011) Adaptive Thresholding for Sparse Covariance Matrix Estimation. Journal of the American Statistical Association, 106, 672-684.
https://doi.org/10.1198/jasa.2011.tm10560
[10]
Feng, L., Zou, C. and Wang, Z. (2016) Multivariate-Sign-Based High-Dimensional Tests for the Two-Sample Location Problem. Journal of the American Statistical Association, 111, 721-735. https://doi.org/10.1080/01621459.2015.1035380
[11]
Chen, L.S., Paul, D., Prentice, R.L. and Wang, P. (2011) A Regularized Hotelling’s T² Test for Pathway Analysis in Proteomic Studies. Journal of the American Statistical Association, 106, 1345-1360. https://doi.org/10.1198/jasa.2011.ap10599
[12]
Lopes, M., Jacob, L. and Wainwright, M.J. (2011) A More Powerful Two-Sample Test in High Dimensions Using Random Projection. In: Shawe-Taylor, J., Zemel, R.S., Bartlett, P.L., Pereira, F. and Weinberger, K.Q., Eds., Advances in Neural Information Processing Systems, Vol. 24, Curran Associates Inc., Red Hook, 1206-1214.
[13]
Zhang, J. and Pan, M. (2016) A High-Dimension Two-Sample Test for the Mean Using Cluster Subspaces. Computational Statistics and Data Analysis, 97, 87-97.
https://doi.org/10.1016/j.csda.2015.12.004
[14]
Srivastava, R., Li, P. and Ruppert, D. (2016) RAPPT: An Exact Two-Sample Test in High Dimensions Using Random Projections. Journal of Computational and Graphical Statistics, 25, 954-970. https://doi.org/10.1080/10618600.2015.1062771
[15]
He, Y., Zhang, M., Zhang, X. and Zhou, W. (2020) High-Dimensional Two-Sample Mean Vectors Test and Support Recovery with Factor Adjustment. Computational Statistics and Data Analysis, 151, Article ID: 107004.
https://doi.org/10.1016/j.csda.2020.107004
[16]
Burczynski, M.E., Peterson, R.L., Twine, N.C., Zuberek, K.A., Brodeur, B.J., Casciotti, L., Maganti, V., Reddy, P.S., Strahs, A., Immermann, F., Spinelli, W., Schwertschlag, U., Slager, A.M., Cotreau, M.M. and Dorner, A.J. (2006) Molecular Classification of Crohn’s Disease and Ulcerative Colitis Patients Using Transcriptional Profiles in Peripheral Blood Mononuclear Cells. The Journal of Molecular Diagnostics, 8, 51-61. https://doi.org/10.2353/jmoldx.2006.050079