Efficient data management is essential for providing timely and accurate patient care, yet traditional partitioning methods in relational databases often struggle with the high volume, heterogeneity, and regulatory complexity of healthcare data. This research introduces a tailored partitioning strategy that uses the MD5 hashing algorithm to improve data insertion, query performance, and load balancing in healthcare systems. By applying a consistent hash function to patient IDs, our approach distributes records uniformly across partitions; because the partition holding a record can be computed directly from its patient ID, retrieval paths are shortened and access latency is reduced, while data integrity and compliance are preserved. We evaluated the method in three sets of experiments covering partitioning efficiency, scalability, and fault tolerance. The partitioning efficiency analysis compared our MD5-based approach with standard round-robin partitioning, measuring insertion time, query latency, and the balance of the data distribution. The scalability tests assessed system performance across increasing dataset sizes and varying partition counts, and the fault tolerance experiments examined data integrity and retrieval performance under simulated partition failures. The experimental results show that the MD5-based partitioning strategy significantly reduces query retrieval times, achieving up to X% better performance compared to round-robin methods. It also scales effectively to larger datasets, maintaining low latency, and remains resilient under failure scenarios. The approach offers a scalable, efficient, and fault-tolerant solution for healthcare systems, supporting faster clinical decision-making and improved patient care in complex data environments.
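The core idea of hashing patient IDs to partitions can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this work: the function names, the patient ID format (`P0000001`), and the partition count of 8 are assumptions made for illustration only. The sketch maps each patient ID to a partition by reducing its MD5 digest modulo the partition count, and contrasts this with round-robin assignment, where the partition depends on insertion order rather than on the key.

```python
import hashlib
from collections import Counter


def md5_partition(patient_id: str, num_partitions: int) -> int:
    """Map a patient ID to a partition index via its MD5 digest.

    Hypothetical helper: the 128-bit digest is read as an integer and
    reduced modulo the partition count.
    """
    digest = hashlib.md5(patient_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_partitions


def round_robin_partition(insert_index: int, num_partitions: int) -> int:
    """Baseline: the partition depends only on insertion order."""
    return insert_index % num_partitions


def distribution(patient_ids, num_partitions):
    """Count how many records land in each partition under MD5 hashing."""
    return Counter(md5_partition(pid, num_partitions) for pid in patient_ids)


if __name__ == "__main__":
    num_partitions = 8  # assumed value for illustration
    ids = [f"P{i:07d}" for i in range(10_000)]  # assumed ID format

    # Records per partition; roughly even if the hash spreads IDs well.
    counts = distribution(ids, num_partitions)
    for partition in range(num_partitions):
        print(partition, counts[partition])

    # A lookup by patient ID recomputes the same partition index,
    # so only one partition has to be searched.
    print(md5_partition("P0000042", num_partitions))
```

The design point is that the MD5 mapping is a pure function of the key, so a reader can locate a record without knowing when it was inserted. Under round-robin assignment, a lookup by patient ID has to scan all partitions or consult a separate index. MD5 is used here only as a distribution function, not as a security mechanism.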