Zoonotic diseases can be transmitted via an arthropod vector, and disease risk maps are often created from associative factors in the landscape surrounding known occurrences. A limitation, however, is the difficulty of mapping disease risk at a meaningful geographic scale, and traditional regression modeling approaches may not always be appropriate. Our objective was to determine whether nonlinear modeling could improve explanatory power in describing the occurrence of two tick-borne diseases known to occur in Tennessee: Lyme disease (LD) and Rocky Mountain spotted fever (RMSF). Medically diagnosed cases of LD (ICD-9: 088.81) and RMSF (ICD-9: 082.0) were extracted from a managed care organization data warehouse for the 2000–2009 time period. Four models were constructed (logistic regression, classification and regression tree (CART), gradient boosted tree (GBT), and neural network (NNET)) and compared for accuracy. Results suggest that areas with higher disease prevalence were not necessarily the areas with high predicted disease risk. GBT best explained LD occurrence (misclassification rate: 0.232; ROC: 0.789). RMSF prevalence was best explained with an NNET algorithm (misclassification rate: 0.288; ROC: 0.696). Covariates explaining disease risk included forested wetlands, urbanization, and median income. Nonlinear modeling may provide better results than traditional regression-based approaches.
1. Introduction
Because zoonotic diseases can be transmitted via an arthropod vector, understanding vector habitat is often of interest in the epidemiologic study of these diseases. It is common in spatial epidemiology to describe vector habitat and then create causal inference risk maps of potentially high-risk areas based on habitat preferences [1, 2]. These geospatial mapping exercises outline areas having high probabilities of vector prevalence and then infer disease risk based on probable presence or absence.
For example, the abundance of ticks in the genus Ixodes, one species of which is the vector primarily responsible for transmitting Lyme disease (LD), is associated with temperature, landscape slope [3], forested areas with sandy soils [4], and increasing residential development [5]. Tularemia prevalence is positively associated with dry forested habitat areas [6]. Human populations living within forested areas and on specific soils are at higher risk of contracting LD [7, 8]. Human monocytic ehrlichiosis (HME, caused by Ehrlichia chaffeensis) is more strongly associated with wooded habitats than with neighboring grassy areas [9]. A major
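The abstract compares models on two accuracy measures, the misclassification rate and the ROC statistic. As a minimal sketch of how such measures can be computed, and not the authors' implementation, the Python below assumes a 0.5 classification threshold and takes the ROC value to mean the area under the ROC curve. The function names, labels, and predicted risks are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def misclassification_rate(y_true: Sequence[int], y_prob: Sequence[float],
                           threshold: float = 0.5) -> float:
    """Fraction of observations whose thresholded prediction differs from the label."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumption: a predicted risk at or above the threshold counts as "case present".
    wrong = sum(int(p >= threshold) != y for y, p in zip(y_true, y_prob))
    return wrong / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Area under the ROC curve, computed pairwise (Mann-Whitney formulation)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    # Each (case, non-case) pair scores 1 if the case is ranked higher, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = disease case present in an area, 0 = absent.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.2, 0.6, 0.4, 0.3, 0.7, 0.8, 0.1]  # made-up predicted risks
model_b = [0.6, 0.5, 0.4, 0.7, 0.6, 0.3, 0.5, 0.4]  # made-up predicted risks

print(misclassification_rate(labels, model_a), roc_auc(labels, model_a))  # 0.25 0.875
print(misclassification_rate(labels, model_b), roc_auc(labels, model_b))  # 0.375 0.71875
```

On these toy numbers the first model would be preferred on both measures, since it has the lower misclassification rate and the higher area under the curve.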
References
[1]
M. C. Wimberly, A. D. Baer, and M. J. Yabsley, “Enhanced spatial models for predicting the geographic distributions of tick-borne pathogens,” International Journal of Health Geographics, vol. 7, p. 15, 2008.
[2]
A. M. Winters, R. J. Eisen, S. Lozano-Fuentes, C. G. Moore, W. J. Pape, and L. Eisen, “Predictive spatial models for risk of West Nile virus exposure in eastern and western Colorado,” American Journal of Tropical Medicine and Hygiene, vol. 79, no. 4, pp. 581–590, 2008.
[3]
R. S. Lane and H. A. Stubbs, “Host-seeking behavior of adult Ixodes pacificus (Acari: Ixodidae) as determined by flagging vegetation,” Journal of Medical Entomology, vol. 27, no. 3, pp. 282–287, 1990.
[4]
U. Kitron, C. J. Jones, J. K. Bouseman, J. A. Nelson, and D. L. Baumgartner, “Spatial analysis of the distribution of Ixodes dammini (Acari: Ixodidae) on white-tailed deer in Ogle County, Illinois,” Journal of Medical Entomology, vol. 29, no. 2, pp. 259–266, 1992.
[5]
S. Aronoff, Geographic Information Systems: A Management Perspective, WDL Publications, Ottawa, Canada, 1989.
[6]
R. J. Eisen, P. S. Mead, A. M. Meyer, L. E. Pfaff, K. K. Bradley, and L. Eisen, “Ecoepidemiology of tularemia in the Southcentral United States,” American Journal of Tropical Medicine and Hygiene, vol. 78, no. 4, pp. 586–594, 2008.
[7]
G. E. Glass, B. S. Schwartz, J. M. Morgan, D. T. Johnson, P. M. Noy, and E. Israel, “Environmental risk factors for Lyme disease identified with geographic information systems,” American Journal of Public Health, vol. 85, no. 7, pp. 944–948, 1995.
[8]
M. E. Killilea, A. Swei, R. S. Lane, C. J. Briggs, and R. S. Ostfeld, “Spatial dynamics of Lyme disease: a review,” EcoHealth, vol. 5, no. 2, pp. 167–195, 2008.
[9]
H. Gaff and E. Schaefer, “Metapopulation models in tick-borne disease transmission modelling,” Advances in Experimental Medicine and Biology, vol. 673, pp. 51–65, 2010.
[10]
L. Eisen and R. J. Eisen, “Need for improved methods to collect and present spatial epidemiologic data for vectorborne diseases,” Emerging Infectious Diseases, vol. 13, no. 12, pp. 1816–1820, 2007.
[11]
R. Sugumaran, S. R. Larson, and J. P. DeGroote, “Spatio-temporal cluster analysis of county-based human West Nile virus incidence in the continental United States,” International Journal of Health Geographics, vol. 8, no. 1, p. 43, 2009.
[12]
F. Mostashari, M. Kulldorff, J. J. Hartman, J. R. Miller, and V. Kulasekera, “Dead bird clusters as an early warning system for West Nile virus activity,” Emerging Infectious Diseases, vol. 9, no. 6, pp. 641–646, 2003.
[13]
R. J. Eisen, R. S. Lane, C. L. Fritz, and L. Eisen, “Spatial patterns of Lyme disease risk in California based on disease incidence data and modeling of vector-tick exposure,” American Journal of Tropical Medicine and Hygiene, vol. 75, no. 4, pp. 669–676, 2006.
[14]
S. G. Jones, W. Conner, B. Song, D. Gordon, and A. Jayakaran, “Comparing spatio-temporal clusters of arthropod-borne infections using administrative medical claims and state reported surveillance data,” Spatial and Spatio-Temporal Epidemiology, vol. 3, no. 3, pp. 205–213, 2012.
[15]
S. G. Jones and M. Kulldorff, “Influence of spatial resolution on space-time disease cluster detection,” PLoS ONE, in press, 2012, http://dx.plos.org/10.1371/journal.pone.0048036.
[16]
J. Wieczorek, Q. Guo, and R. J. Hijmans, “The point-radius method for georeferencing locality descriptions and calculating associated uncertainty,” International Journal of Geographical Information Science, vol. 18, no. 8, pp. 745–767, 2004.
[17]
J. A. Bissonette, “Small sample size problems in wildlife ecology: a contingent analytical approach,” Wildlife Biology, vol. 5, no. 2, pp. 65–71, 1999.
[18]
S. G. Jones, S. Coulter, and W. Conner, “Using administrative medical claims data to supplement state disease registry systems for reporting zoonotic infections,” Journal of the American Medical Informatics Association, in press.
[19]
Tennessee Wildlife Resources Agency, Tennessee Land Use/Land Cover Landsat TM imagery, Tennessee Spatial Data Service metadata files, http://www.tngis.org/frequently_accessed_data.html, 1997.
[20]
L. Cowardin, V. Carter, E. Golet, and E. LaRoe, “Classification of wetlands and deepwater habitats of the United States,” US Fish and Wildlife Service FWS/OBS 79/31, 1979.
[21]
M. Efroymson, “Multiple regression analysis,” in Mathematical Methods for Digital Computers, A. Ralston and H. S. Wilf, Eds., chapter 17, Wiley, New York, NY, USA, 1960.
[22]
L. Breiman, J. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, Wadsworth International Group, Belmont, Calif, USA, 1984.
[23]
B. De Ville, Decision Trees for Business Intelligence and Data Mining: Using SAS Enterprise Miner, SAS Publishing, Cary, NC, USA, 2006.
[24]
J. Elith, J. R. Leathwick, and T. Hastie, “A working guide to boosted regression trees,” Journal of Animal Ecology, vol. 77, no. 4, pp. 802–813, 2008.
[25]
J. H. Friedman, “Greedy function approximation: a gradient boosting machine,” Annals of Statistics, vol. 29, no. 5, pp. 1189–1232, 2001.
[26]
J. H. Friedman, “Stochastic gradient boosting,” Computational Statistics and Data Analysis, vol. 38, no. 4, pp. 367–378, 2002.
[27]
A. Lapedes and R. Farber, “Nonlinear signal processing using neural networks: prediction and system modeling,” Tech. Rep. LA-UR87-2662, Los Alamos National Laboratory, Los Alamos, NM, USA, 1987.
[28]
SAS Institute Inc, SAS Enterprise Miner 6.1: Single-User Installation Guide, SAS Institute Inc, Cary, NC, USA, 2009.
[29]
R. Wall and P. Cunningham, “Exploring the potential for rule extraction from ensembles of neural networks,” in Proceedings of the 11th Irish Conference on Artificial Intelligence and Cognitive Science, J. Griffith and C. O'Riordan, Eds., Computer Science Technical Report TCD-CS-2000-24, pp. 52–68, Trinity College, Dublin, Ireland, 2000.
[30]
A. C. Steere, S. E. Malawista, D. R. Snydman et al., “Lyme arthritis: an epidemic of oligoarticular arthritis in children and adults in three Connecticut communities,” Arthritis and Rheumatism, vol. 20, no. 1, pp. 7–17, 1977.
[31]
G. O. Maupin, D. Fish, J. Zultowsky, E. G. Campos, and J. Piesman, “Landscape ecology of Lyme disease in a residential area of Westchester County, New York,” American Journal of Epidemiology, vol. 133, no. 11, pp. 1105–1113, 1991.
[32]
R. G. McLean, S. R. Ubico, C. A. N. Hughes, S. M. Engstrom, and R. C. Johnson, “Isolation and characterization of Borrelia burgdorferi from blood of a bird captured in the Saint Croix River Valley,” Journal of Clinical Microbiology, vol. 31, no. 8, pp. 2038–2043, 1993.
[33]
H. S. Ginsberg, P. A. Buckley, M. G. Balmforth, E. Zhioua, S. Mitra, and F. G. Buckley, “Reservoir competence of native North American birds for the Lyme disease spirochete, Borrelia burgdorferi,” Journal of Medical Entomology, vol. 42, no. 3, pp. 445–449, 2005.
[34]
N. H. Ogden, R. L. Lindsay, K. Hanincová et al., “Role of migratory birds in introduction and range expansion of Ixodes scapularis ticks and of Borrelia burgdorferi and Anaplasma phagocytophilum in Canada,” Applied and Environmental Microbiology, vol. 74, no. 12, pp. 3919–3919, 2008.
[35]
L. A. Magnarelli, A. Denicola, K. C. Stafford, and J. F. Anderson, “Borrelia burgdorferi in an urban environment: white-tailed deer with infected ticks and antibodies,” Journal of Clinical Microbiology, vol. 33, no. 3, pp. 541–544, 1995.
[36]
R. G. Wilkinson and K. E. Pickett, “Income inequality and population health: a review and explanation of the evidence,” Social Science and Medicine, vol. 62, no. 7, pp. 1768–1784, 2006.
[37]
A. Lusardi, D. Schneider, and P. Tufano, “The economic crisis and medical care usage,” Working Paper 10-079, Harvard Business School, 2010.
[38]
Q. H. Liu, G. Y. Chen, Y. Jin et al., “Evidence for a high prevalence of spotted fever group rickettsial infections in diverse ecologic zones of Inner Mongolia,” Epidemiology and Infection, vol. 115, no. 1, pp. 177–183, 1995.
[39]
P. Parola and D. Raoult, “Ticks and tickborne bacterial diseases in humans: an emerging infectious threat,” Clinical Infectious Diseases, vol. 32, no. 6, pp. 897–928, 2001.
[40]
E. J. Masters, G. S. Olson, S. J. Weiner, and C. D. Paddock, “Rocky Mountain spotted fever: a clinician's dilemma,” Archives of Internal Medicine, vol. 163, no. 7, pp. 769–774, 2003.
[41]
C. G. Helmick, K. W. Bernard, and L. J. D'Angelo, “Rocky Mountain spotted fever: clinical, laboratory, and epidemiological features of 262 cases,” Journal of Infectious Diseases, vol. 150, no. 4, pp. 480–488, 1984.