

A State of Art Analysis of Telecommunication Data by k-Means and k-Medoids Clustering Algorithms

DOI: 10.4236/jcc.2018.61019, PP. 190-202

Keywords: k-Means Algorithm, k-Medoids Algorithm, Data Clustering, Time Complexity, Telecommunication Data


Abstract:

Cluster analysis is one of the major data analysis methods and is widely used in many practical applications in emerging areas of data mining. A good clustering method produces high-quality clusters with high intra-cluster similarity and low inter-cluster similarity. Clustering techniques are applied in different domains to predict future trends from available data and to inform its real-world use. This research work evaluates the performance of two of the most representative partition-based clustering algorithms, namely k-Means and k-Medoids. Both algorithms are implemented, and their performance is analyzed in terms of clustering result quality, execution time, and other components. Telecommunication data is the source data for this analysis: connection-oriented broadband data is given as input to assess the clustering quality of the algorithms, and the distance between the server locations and their connections is the quantity used for clustering. The execution time of each algorithm is measured and the results are compared with one another. The results of the comparison study are satisfactory for the chosen application.
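
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation or data: the distances are synthetic one-dimensional values (an assumption standing in for server-to-connection distances), the function names are hypothetical, and the k-Medoids variant shown is a simple alternating update rather than any specific published procedure.

```python
import random
import time


def make_distances(seed=42):
    """Synthetic 1-D distances (assumed data): three groups of 100 values."""
    rng = random.Random(seed)
    data = []
    for mean in (5.0, 50.0, 200.0):
        data += [rng.gauss(mean, 2.0) for _ in range(100)]
    return data


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    k = len(centers)
    return [min(range(k), key=lambda c: abs(p - centers[c])) for p in points]


def kmeans_1d(points, k, iters=100, seed=0):
    """Lloyd-style k-Means: centers move to the mean of their members."""
    centers = random.Random(seed).sample(points, k)
    labels = []
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            # Keep the old center if a cluster happens to be empty.
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def kmedoids_1d(points, k, iters=100, seed=0):
    """Alternating k-Medoids: each center must be an actual data point."""
    medoids = random.Random(seed).sample(points, k)
    labels = []
    for _ in range(iters):
        labels = assign(points, medoids)
        new_medoids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if not members:
                new_medoids.append(medoids[c])
                continue
            # Pick the member with the smallest total distance to the others.
            new_medoids.append(min(members, key=lambda m: sum(abs(m - q) for q in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    data = make_distances()
    (centers, _), t_means = timed(kmeans_1d, data, 3)
    (medoids, _), t_medoids = timed(kmedoids_1d, data, 3)
    print("k-Means centers:", sorted(centers), "time:", t_means)
    print("k-Medoids medoids:", sorted(medoids), "time:", t_medoids)
```

In this sketch the medoid update is quadratic in cluster size while the mean update is linear, which is one reason execution times of the two methods can differ; the actual timings depend on the data and the machine.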

