The Exploration of the Approach to Data Preparation for Chinese Text Analysis Based on R Language

doi:10.4236/oalib.1107821

Journal: OALib Journal
ISSN: 2333-9721

Open Access Library Journal, Vol. 8, 2021

DOI: 10.4236/oalib.1107821, PP. 1-8

Jiang Li

Subject Areas: Big Data Search and Mining, Complex Network Models

Keywords: Data Preparation, Text Analysis, R Language, Chinese Text Segmentation

Abstract

This paper explores how to prepare data for Chinese text analysis in the R language, following the framework of Welbers et al. In particular, it compares the R packages Rwordseg and jiebaR by examining the Chinese text segmentation that each produces at the preprocessing step.
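
The comparison described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the paper's own procedure. It assumes the jiebaR and Rwordseg packages are installed and uses their `worker()`/`segment()` and `segmentCN()` functions with default settings. The helper names `compare_segmentations`, `segment_with_jiebaR`, and `segment_with_Rwordseg` are hypothetical, and the sample sentence is only an example.

```r
# Hypothetical helper (base R only): summarise how two token vectors
# produced from the same sentence differ.
compare_segmentations <- function(tokens_a, tokens_b) {
  list(
    n_a    = length(tokens_a),            # number of tokens from segmenter A
    n_b    = length(tokens_b),            # number of tokens from segmenter B
    shared = intersect(tokens_a, tokens_b),
    only_a = setdiff(tokens_a, tokens_b),
    only_b = setdiff(tokens_b, tokens_a)
  )
}

# Segment with jiebaR if it is installed; return NULL otherwise.
segment_with_jiebaR <- function(text) {
  if (!requireNamespace("jiebaR", quietly = TRUE)) return(NULL)
  seg <- jiebaR::worker()                 # default segmentation worker
  jiebaR::segment(text, seg)
}

# Segment with Rwordseg if it is installed; return NULL otherwise.
segment_with_Rwordseg <- function(text) {
  if (!requireNamespace("Rwordseg", quietly = TRUE)) return(NULL)
  Rwordseg::segmentCN(text)               # default analyzer
}

text <- "民法典正式施行"                    # illustrative sentence (assumption)
tokens_jieba    <- segment_with_jiebaR(text)
tokens_rwordseg <- segment_with_Rwordseg(text)

# Only compare when both packages were available.
if (!is.null(tokens_jieba) && !is.null(tokens_rwordseg)) {
  str(compare_segmentations(tokens_rwordseg, tokens_jieba))
}
```

The comparison helper is kept separate from the package calls so that it can be checked on hand-written token vectors. The actual tokens each package returns depend on its version and dictionaries, so no output is shown here.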

Cite this paper

Li, J. (2021). The Exploration of the Approach to Data Preparation for Chinese Text Analysis Based on R Language. Open Access Library Journal, 8, e07821. doi: http://dx.doi.org/10.4236/oalib.1107821.

References

[1]	Wang Jiayue. Statistical Methods for Linguistics Based on R [M]. Beijing: Foreign Language Teaching and Research Press, 2019: 26-130. (In Chinese)
[2]	Welbers, K., Van Atteveldt, W. and Benoit, K. (2017) Text Analysis in R. Communication Methods and Measures, 11, 245-265. https://doi.org/10.1080/19312458.2017.1387238
[3]	R Core Team (2020) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.r-project.org/
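[4]	Wang Jianhong, Ran Yingxue. The "Temperamental Marx" in Das Kapital: A Text Sentiment Analysis Based on the R Package syuzhet [J]. Journal of Hainan Radio & TV University, 2002, 79(2): 31-37. (In Chinese)
[5]	Meng Shiqiong, Meng Shiyao, Yin Zhi. Data Mining and Visualization Methods for Automobile Consumption Data Based on R [J]. Journal of Ningbo University of Technology, 2015, 27(4): 17-23. (In Chinese)
[6]	Zhu Changsheng, Sun Xin, Feng Wenfang. A Study of the Influence of Online Public Opinion on the Stock Market Based on R [J]. Journal of Lanzhou University of Technology, 2018, 44(4): 103-108. (In Chinese)
[7]	Tang Lin, Guo Chonghui, Chen Jingfeng. A Review of Research on Chinese Word Segmentation Technology [J]. Data Analysis and Knowledge Discovery, 2020, 4(2/3): 1-17. (In Chinese)
[8]	Liang Xitao, Gu Lei. Research on Chinese Word Segmentation and Part-of-Speech Tagging [J]. Computer Technology and Development, 2015, 25(2): 175-180. (In Chinese)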
[9]	Li, J. (2019) Rwordseg: Chinese Word Segmentation. R Package, Version 0.3-2. https://CRAN.R-project.org/package=Rwordseg
[10]	Qin, W. and Wu, Y. (2019) jiebaR: Chinese Text Segmentation. R Package, Version 0.11. https://CRAN.R-project.org/package=jiebaR
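[11]	Wu Danlu, Wei Tong, Xu Jiaqing. Text Visualization and Topic Analysis in the R Environment: A Case Study of Social Service Platform Data [J]. Journal of Ningbo University of Technology, 2015, 27(1): 19-25. (In Chinese)
[12]	Yang Jie. The Civil Code Officially Takes Effect! With the Marriage Law, Succession Law, Contract Law and Others Repealed, Your Life Will Be Very Different in 2021 [EB/OL]. https://news.sina.com.cn/c/2021-01-01/doc-iiznctke9619275.shtml, 2021-01-01. (In Chinese)
[13]	Cao Xueqin; compiled by Cheng Weiyuan and Gao E; annotated by Qi Gong et al. Dream of the Red Chamber: Annotated Cheng-Yi Edition [M]. Guilin: Guangxi Normal University Press, 2017: 1-17. (In Chinese)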
