
PLOS ONE  2012 

Repeatability and Reproducibility of Decisions by Latent Fingerprint Examiners

DOI: 10.1371/journal.pone.0032800



The interpretation of forensic fingerprint evidence relies on the expertise of latent print examiners. We tested latent print examiners on the extent to which they reached consistent decisions. This study assessed intra-examiner repeatability by retesting 72 examiners on comparisons of latent and exemplar fingerprints after an interval of approximately seven months; each examiner was reassigned 25 image pairs for comparison, out of a total pool of 744 image pairs. We compared these repeatability results with reproducibility (inter-examiner) results derived from our previous study. Examiners repeated 89.1% of their individualization decisions and 90.1% of their exclusion decisions; most of the changed decisions became inconclusive decisions. Repeatability of comparison decisions (individualization, exclusion, inconclusive) was 90.0% for mated pairs and 85.9% for nonmated pairs. Repeatability and reproducibility were notably lower for comparisons assessed by the examiners as “difficult” than for “easy” or “moderate” comparisons, indicating that examiners' assessments of difficulty may be useful for quality assurance. No false positive errors were repeated (n = 4); 30% of false negative errors were repeated. One percent of latent value decisions were completely reversed (no value even for exclusion vs. of value for individualization). Most of the inter- and intra-examiner variability concerned whether the examiners considered the information available to be sufficient to reach a conclusion; this variability was concentrated on specific image pairs, such that repeatability and reproducibility were very high on some comparisons and very low on others. Much of the variability appears to be due to making categorical decisions in borderline cases.
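As a simplified illustration (not the authors' exact methodology, which conditions on decision type and uses bootstrap resampling), intra-examiner repeatability figures like those above can be read as percent agreement between an examiner's decisions on the same image pairs across the two test rounds. The decision labels and sample data below are hypothetical:

```python
def repeatability(first_round, second_round):
    """Fraction of decisions repeated unchanged across two test rounds.

    `first_round` and `second_round` are parallel lists of decisions
    ("individualization", "exclusion", "inconclusive") made by the same
    examiner on the same image pairs, roughly seven months apart.
    """
    if len(first_round) != len(second_round):
        raise ValueError("rounds must cover the same image pairs")
    repeated = sum(a == b for a, b in zip(first_round, second_round))
    return repeated / len(first_round)

# Hypothetical examiner: one exclusion changed to inconclusive on retest.
round1 = ["individualization", "exclusion", "inconclusive", "exclusion"]
round2 = ["individualization", "inconclusive", "inconclusive", "exclusion"]
print(repeatability(round1, round2))  # 0.75
```

Note that a change to "inconclusive" lowers percent agreement without being an error in the false-positive/false-negative sense, which is why the study reports repeatability separately from error rates.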






