This article addresses the problem of estimating the correlation between two
variables when some values fall below the known limit of detection (LOD) of the
measuring device; such values are known as non-detects (NDs). We use simulation
to compare several methods for estimating the
association between two such variables. The most commonly used method, simple
substitution, consists of replacing each ND with some representative value such
as LOD/2. Spearman’s correlation, in which all NDs are assumed to be tied at
some value just smaller than the LOD, is also used. We evaluate each method
under several scenarios, including small to moderate sample sizes, moderate to
large censoring proportions, extreme imbalance in censoring proportions, and
data that are not bivariate normal (BVN).
We focus on the coverage probability of 95% confidence
intervals obtained using each method. Confidence intervals using a maximum
likelihood approach based on the assumption of BVN data have acceptable
performance under most scenarios, even with non-BVN data. Intervals based on
Spearman’s coefficient also perform well under many conditions. The methods are
illustrated using real data taken from the biomarker literature.
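
As a rough illustration of the two simpler estimators described above, the following Python sketch computes a Pearson correlation after LOD/2 substitution and a Spearman correlation with all NDs tied just below the LOD. It is a minimal sketch, not the authors' implementation: the function names, the lognormal demonstration data, the correlation of 0.6, and the choice of LODs at the 30th and 50th percentiles are all assumptions made for this example. The maximum likelihood approach is not shown.

```python
import numpy as np
from scipy import stats


def substitute_nd(values, lod, fill=None):
    """Replace every value below `lod` with `fill` (default: lod / 2)."""
    values = np.asarray(values, dtype=float)
    fill = lod / 2.0 if fill is None else fill
    return np.where(values < lod, fill, values)


def pearson_simple_substitution(x, y, lod_x, lod_y):
    """Pearson correlation after replacing each ND with LOD/2."""
    xs = substitute_nd(x, lod_x)
    ys = substitute_nd(y, lod_y)
    return float(stats.pearsonr(xs, ys)[0])


def spearman_tied_nd(x, y, lod_x, lod_y):
    """Spearman correlation with all NDs tied at a value just below the LOD.

    Ranks depend only on ordering, so any common value smaller than the
    LOD gives the same result; the largest float below the LOD is used here.
    """
    xs = substitute_nd(x, lod_x, fill=np.nextafter(float(lod_x), -np.inf))
    ys = substitute_nd(y, lod_y, fill=np.nextafter(float(lod_y), -np.inf))
    return float(stats.spearmanr(xs, ys)[0])


if __name__ == "__main__":
    # Hypothetical demonstration data: bivariate normal on the log scale.
    rng = np.random.default_rng(0)
    z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=50)
    x, y = np.exp(z[:, 0]), np.exp(z[:, 1])

    # Assumed LODs giving roughly 30% and 50% censoring, respectively.
    lod_x, lod_y = np.quantile(x, 0.3), np.quantile(y, 0.5)

    print("Pearson, LOD/2 substitution:", pearson_simple_substitution(x, y, lod_x, lod_y))
    print("Spearman, NDs tied below LOD:", spearman_tied_nd(x, y, lod_x, lod_y))
```

In practice the values below the LOD are not observed at all; the sketch censors simulated values only so that both estimators can be applied to the same data.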