Reproducibility in Transportation Research: Importance, Best Practices, and Dealing with Protected and Sensitive Data

DOI: 10.4236/jtts.2025.151010, PP. 179-202

Keywords: Reproducibility, Openness, Transparency, Scientific Method, Responsible Research

Abstract:

Reproducibility is a key aspect of the scientific method because it provides evidence for research claims. It is essential to promote openness, accessibility, and collaboration within the scientific community. This article introduces best practices in reproducibility that are relevant to the transportation research community, discusses issues and barriers to reproducibility, and describes methods for addressing them. It begins with openness and transparency, then covers several key best practices for reproducibility in transportation engineering, highlighting common methods and techniques as well as their associated benefits. The paper concludes with a discussion of the key barriers to implementing reproducibility practices in transportation research and potential solutions. The barriers include existing culture and attitudes, data sensitivity, insufficient methodological detail, lack of code sharing, limited validation, additional time and research burden, and skill and knowledge gaps. Discussing each of these items gives the transportation research community an opportunity to evolve into one that embraces the openness and transparency of reproducibility.
