The data traveling across the internet today comprises very large and complex
sets of raw facts that are not only voluminous but also noisy, heterogeneous,
and longitudinal. Companies, institutions, healthcare systems, mobile
applications, capturing devices and sensors, traffic management, banking,
retail, education, etc. use large amounts of data, which are then used to
create reports that help ensure the continuity of the services they offer.
Big Data has recently become one of the most important topics in the IT
industry. Managing Big Data requires new techniques because traditional
security and privacy mechanisms are inadequate and cannot handle complex
distributed computing over different types of data. New types of data also
bring new and different challenges. Much research over the past two decades,
starting with Doug Laney’s landmark paper, has addressed Big Data challenges;
the central challenge is how to handle a huge volume of data that must be
delivered securely over the internet and reach its destination intact.
The present paper highlights the important concepts behind fifty-six Big Data
V’s characteristics. It also highlights the security and privacy challenges
that Big Data faces and proposes technological solutions that help address
these challenges.
References
[1]
Hussien, A.A. (2020) How Many Old and New Big Data V’s Characteristics, Processing Technology, and Applications (BD1). International Journal of Application or Innovation in Engineering & Management, 9, 15-27. http://www.ijaiem.org/
[2]
Priyadarshy, S. (2015) The 7 Pillars of Big Data. Petroleum Review, 14 January.
[3]
Trifu, M.R. and Ivan, M.L. (2014) Big Data: Present and Future. Database Systems Journal, 1, No. 1.
[4]
Firican, G. (2017) The 10 V’S BIG DATA. Work Paper, 8 February.
[5]
Borne, K. (2014) Top 10 Big Data Challenges—A Serious Look at 10 Big Data V’s. Blog Post, 11 April.
[6]
Vorhies, W. (2014) How Many V’S in Big Data. View Blog Work Paper, 31 October. http://www.aimspress.com/journal/Math
[7]
Dhamodharavadhani, S. and Rajasekaran, G. (2018) Unlock Different V’s of Big Data for Analytics. International Journal of Computer Sciences and Engineering, 6, Special Issue-4.
[8]
Dr. Darrin (2016). https://educationalresearchtechniques.wordpress.com/2016/05/02/characteristics-of-big-data/
Laney, D. (2012) http://blogs.gartner.com/doug-laney/deja-vvvue-others-claiming-gartners-volume-velocity-variety-construct-for-big-data/
[14]
Cartledge, C. (2016) How Many vs Are There in Big Data? Working Paper, 18 February.
[15]
Borne, D. (2014). https://www.mapr.com/blog/top-10-big-data-challenges-serious-look-10-big-data-vs
[16]
Sivarajah, U., Kamal, M.M., Irani, Z. and Weerakkody, V. (2017) Critical Analysis of Big Data Challenges and Analytical Methods. Journal of Business Research, 70, 263-286. https://doi.org/10.1016/j.jbusres.2016.08.001
[17]
Boyd, D. and Crawford, K. (2012) Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon. Information, Communication & Society, 15, 662-679. https://doi.org/10.1080/1369118X.2012.678878
[18]
Venkatraman, S. and Venkatraman, R. (2019) Big Data Security Challenges and Strategies. AIMS Mathematics, 4, 860-879. https://doi.org/10.3934/math.2019.3.860
[19]
Hargittai, E. (2013) Is Bigger Always Better? Potential Biases of Big Data Derived from Social Network Sites. The ANNALS of the American Academy of Political and Social Science, 659, 63-76. https://doi.org/10.1177/0002716215570866
[20]
Mayer, K., et al. (2009) Computational Social Science. Schlüsselwerke der Netzwerkforschung ook.
[21]
Wang, Y.X. and Wiebe, V.J. (2014) Big Data Analytics on the Characteristic Equilibrium of Collective Opinions in Social Networks. International Journal of Cognitive Informatics and Natural Intelligence, 8, Article No.: 3. https://doi.org/10.4018/IJCINI.2014070103
[22]
Akerkar, R. (2014) Big Data Computing. Taylor & Francis Group, CRC Press, New York. https://doi.org/10.1201/b16014
[23]
Zicari, R.V. (2014) Big Data: Challenges and Opportunities. Big Data Computing, 103-128.
[24]
Agrawal, R. and Nyamful, C. (2016) Challenges of Big Data Storage and Management. Global Journal of Information Technology, 6, 1-10. http://sproc.org/ojs/index.php/gjit https://doi.org/10.18844/gjit.v6i1.383
[25]
Tarekegn, G.B. and Munaye, Y.Y. (2016) Big Data: Security Issues, Challenges and Future Scope. International Journal of Computer Engineering & Technology, 7, 12-24.
[26]
Cavanillas, J.M., Curry, E. and Wahlster, W. (2017) New Horizons for a Data-Driven Economy: A Roadmap for Usage and Exploitation of Big Data in Europe. Library of Congress Control Number: 2015951834. https://doi.org/10.1007/978-3-319-21569-3
[27]
Kumar, N., Vasilakos, A.V. and Rodrigues, J.J.P. (2017) A Multi-Tenant Cloud-Based DC Nano Grid for Self-Sustained Smart Buildings in Smart Cities. IEEE Communications Magazine, 55, 14-21. https://doi.org/10.1109/MCOM.2017.1600228CM
[28]
Yao, Z., Mark, P. and Rabbat, M. (2012) Anomaly Detection Using Proximity Graph and PageRank Algorithm. IEEE Transactions on Information Forensics and Security, 7, 1288-1300. https://doi.org/10.1109/TIFS.2012.2191963
[29]
Yan, Z., Ding, W., Niemi, V. and Vasilakos, A.V. (2016) Two Schemes of Privacy-Preserving Trust Evaluation. Future Generation Computer Systems, 62, 175-189. https://doi.org/10.1016/j.future.2015.11.006
[30]
Webb, R. (2018) 12 Challenges of Data Analytics and How to Fix Them. Risk Management Blog, ClearRisk. https://www.clearrisk.com/risk-management-blog/challenges-of-data-analytics
[31]
Puthal, D., Nepal, S., Ranjan, R. and Chen, J.J. (2017) A Dynamic Prime Number Based Efficient Security Mechanism for Big Sensing Data Streams. Journal of Computer and System Sciences, 83, 22-42. https://doi.org/10.1016/j.jcss.2016.02.005
[32]
Akoglu, L., Tong, H.H. and Koutra, D. (2015) Graph Based Anomaly Detection and Description: A Survey. Data Mining and Knowledge Discovery, 29, 626-688. https://doi.org/10.1007/s10618-014-0365-y
[33]
Hasani, Z. and Krrabaj, S. (2019) Survey and Proposal of an Adaptive Anomaly Detection Algorithm for Periodic Data Streams. Journal of Computer and Communications, 7, 33-55. https://doi.org/10.4236/jcc.2019.78004
[34]
Hasani, Z., Jakimovski, B., Velinov, G. and Kon-Popovska, M. (2018) An Adaptive Anomaly Detection Algorithm for Periodic Real Time Data Streams. In: International Conference on Intelligent Data Engineering and Automated Learning, Springer, Berlin, 385-397. https://doi.org/10.1007/978-3-030-03493-1_41
[35]
Shahin, A.A. (2016) Using Multiple Seasonal Holt-Winters Exponential Smoothing to Predict Cloud Resource Provisioning. International Journal of Advanced Computer Science and Applications, 7, 91-96. https://doi.org/10.14569/IJACSA.2016.071113
[36]
Cortez, P., Rocha, M. and Neves, J. (2001) Genetic and Evolutionary Algorithms for Time Series Forecasting. In: Monostori, L., Váncza, J. and Ali, M., Eds., Engineering of Intelligent Systems, Lecture Notes in Computer Science, Vol. 2070, Springer, Berlin, Heidelberg, 393-402. https://doi.org/10.1007/3-540-45517-5_44
[37]
de Assis, M.V.O., Carvalho, L.F., Rodrigues, J.J.P.C. and Proença, M.L. (2013) Holt-Winters Statistical Forecasting and ACO Metaheuristic for Traffic Characterization. IEEE International Conference on Communications, Budapest, 9-13 June 2013, 2524-2528. https://doi.org/10.1109/ICC.2013.6654913
[38]
Scrucca, L. (2013) GA: A Package for Genetic Algorithms. Journal of Statistical Software, 53, 1-37. https://doi.org/10.18637/jss.v053.i04
[39]
NUMENTA, Anomaly Benchmark with Labeled Anomalies. https://github.com/numenta/NAB/tree/master/data/artificialWithAnomaly
[40]
Yahoo: S5-dA, Anomaly Detection Dataset, Version 1.0(16M). https://webscope.sandbox.yahoo.com/catalog.php?datatype=s%5c&did=70
[41]
Zhou, G.M., Zhang, D.X., Liu, Y.J., et al. (2015) A Novel Image Encryption Algorithm Based on Chaos and Line Map. Neurocomputing, 169, 150-157. https://doi.org/10.1016/j.neucom.2014.11.095
[42]
Wang, Z.W., Cao, C., Yang, N.H. and Chang, V. (2017) ABE with Improved Auxiliary Input for Big Data Security. Journal of Computer and System Sciences, 89, 41-50. https://doi.org/10.1016/j.jcss.2016.12.006
[43]
Kshetri, N. (2014) The Emerging Role of Big Data in Key Development Issues: Opportunities, Challenges, and Concerns. Big Data & Society, 1, 1-20. https://doi.org/10.1177/2053951714564227
[44]
Hsu, C., Zeng, B. and Zhang, M. (2014) A Novel Group Key Transfers for Big Data Security. Applied Mathematics and Computation, 249, 436-443. https://doi.org/10.1016/j.amc.2014.10.051
[45]
Kapil, G., Agrawal, A., Attaallah, A., Algarni, A., Kumar, R. and Khan, R.A. (2020) Attribute Based Honey Encryption Algorithm for Securing Big Data: Hadoop Distributed File System Perspective. PeerJ Computer Science, 6, e259. https://doi.org/10.7717/peerj-cs.259
[46]
Wu, X.D., Zhu, X.Q., Wu, G.-Q. and Ding, W. (2014) Data Mining with Big Data. IEEE Transactions on Knowledge and Data Engineering, 26, 97-107. https://doi.org/10.1109/TKDE.2013.109
[47]
Xiao, H., Biggio, B., Brown, G., et al. (2015) Is Feature Selection Secure against Training Data Poisoning? Proceedings of the 32nd International Conference on Machine Learning, Lille, 1689-1698.
[48]
Fuchs, G., Stange, H., Hecker, D., et al. (2015) Constructing Semantic Interpretation of Routine and Anomalous Mobility Behaviors from Big Data. SIGSPATIAL Special, 7, 27-34. https://doi.org/10.1145/2782759.2782765
[49]
Miller, B.A., Beard, M.S. and Bliss, N.T. (2011) Eigenspace Analysis for Threat Detection in Social Networks. Proceedings of the 14th International Conference on Information Fusion, Chicago, 5-8 July 2011, 1-7.
[50]
Hota, S. (2018) Big Data Analysis on YouTube Using Hadoop and MapReduce. International Journal of Computer Engineering in Research Trends, 5, 98-104.
[51]
Remya, G. and Mohan, A. (2015) Distributed Computing Based Methods for Anomaly Analysis in Large Datasets. International Journal of Advanced Research in Computer and Communication Engineering, 4, 427-430.
[52]
Breier, J. and Branišová, J. (2015) Anomaly Detection from Log Files Using Data Mining Techniques. Lecture Notes in Electrical Engineering, 339, 449-457. https://www.researchgate.net/publication/282923954 https://doi.org/10.1007/978-3-662-46578-3_53
[53]
Restuccia, F., D’Oro, S., Kanhere, S.S. and Melodia, T. (2018) Blockchain for the Internet of Things: Present and Future. IEEE Internet of Things Journal, 1, 1-8.
[54]
Christidis, K. and Devetsikiotis, M. (2016) Blockchains and Smart Contracts for the Internet of Things. IEEE Access, 4, 2292-2303. https://doi.org/10.1109/ACCESS.2016.2566339
[55]
Yaga, D., Mell, P., Roby, N. and Scarfone, K. (2018) Blockchain Technology Overview. National Institute of Standards and Technology, U.S. Department of Commerce, 1-27. https://doi.org/10.6028/NIST.IR.8202
[56]
Uchibeke, U.U., Schneider, K.A., Kassani, S.H. and Deters, R. (2018) Blockchain Access Control Ecosystem for Big Data Security. 2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), Halifax, 30 July-3 August 2018, 1373-1378.
[57]
Ramos, L.F.M. and Silva, J.M.C. (2019) Privacy and Data Protection Concerns Regarding the Use of Blockchains in Smart Cities. ICEGOV2019, Melbourne, 3-5 April 2019, 342-347.
[58]
Maull, R., Godsiff, P., Mulligan, C., Brown, A. and Kewell, B. (2017) Distributed Ledger Technology: Applications and Implications. Strategic Change, 26, No. 5. https://doi.org/10.1002/jsc.2148
[59]
Tasca, P. and Tessone, C.J. (2017) Taxonomy of Blockchain Technologies. Principles of Identification and Classification. https://ssrn.com/abstract=2977811
[60]
CNIL (2018) Solutions for a Responsible Use of the Blockchain in the Context of Personal Data. Technical Report. Commission Nationale Informatique & Libertés. https://www.cnil.fr/sites/default/files/atoms/files/blockchain.pdf