Using text mining, this paper takes data from Rodong Sinmun over the past ten years as its research object, extracts hot topics, and analyzes their trends. Given the newspaper's special attributes as a media outlet, the paper focuses on political issues. Nearly one million news texts are extracted, and their topic content is analyzed by combining the LDA topic model with the K-means clustering algorithm. The limitations of the traditional K-means algorithm are addressed on a pre-built big data analysis platform, and the structure and content of the topics are analyzed in detail. Finally, the political themes and public opinion trends of recent years are derived. In terms of application, the work is significant for the analysis and study of Korean big data text.
Cite this paper
Li, H. and Jin, Z. (2019). Research on Political Trend of North Korea Based on Big Data Text Mining Method. Open Access Library Journal, 6, e5893. doi: http://dx.doi.org/10.4236/oalib.1105893.