This paper explores how to prepare data for Chinese text analysis in R, following the approach of Welbers et al. (2017). In particular, it compares the R packages Rwordseg and jiebaR on the results of Chinese text segmentation at the preprocessing step.
Cite this paper
Li, J. (2021). The Exploration of the Approach to Data Preparation for Chinese Text Analysis Based on R Language. Open Access Library Journal, 8, e7821. doi: http://dx.doi.org/10.4236/oalib.1107821.
Welbers, K., Van Atteveldt, W. and Benoit, K. (2017) Text Analysis in R. Communication Methods and Measures, 11, 245-265.
https://doi.org/10.1080/19312458.2017.1387238
R Core Team (2020) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
https://www.r-project.org/