A Semantic-Level Chinese Automatic Proofreading Method Based on Deep Learning

DOI: 10.12677/CSA.2023.137135, pp. 1373-1381

Keywords: Deep Learning, Chinese Grammatical Error Correction, Seq2Seq, Pre-Trained Language Models


Abstract:

Chinese grammatical error correction is the task of detecting and correcting grammatical errors in sentences. Compared with Chinese spelling correction, it must handle not only homophone errors and errors between visually similar characters, but also redundant and missing characters. This paper compares the strengths and weaknesses of different methods through extensive experiments. Rule-based methods require substantial manual effort to construct the rules, while methods based on traditional machine learning suffer from limited feature-extraction ability. Deep-learning methods are currently the dominant approach to grammatical error correction. Because the text to be corrected is inherently uncertain, a single erroneous sentence may admit several valid corrections, and Seq2Seq models and pre-trained language models have therefore achieved good results.
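The difference between spelling errors and the broader class of grammatical errors described above can be illustrated with a small alignment sketch. This is not the paper's method: the function name `classify_edits`, the label names, and the three example sentences are assumptions made purely for illustration. The sketch uses Python's standard `difflib` to align an erroneous sentence with its correction and label each differing span.

```python
from difflib import SequenceMatcher


def classify_edits(source: str, target: str) -> list[tuple[str, str, str]]:
    """Align an erroneous sentence with its correction, character by character.

    Returns (error_type, source_span, target_span) for every differing span.
    The label names are illustrative, not taken from the paper.
    """
    # difflib opcode -> error type, seen from the erroneous sentence's side:
    #   replace: a wrong character stands in for the right one (spelling-style)
    #   delete:  the source contains a redundant character
    #   insert:  the source is missing a character
    labels = {"replace": "substitution", "delete": "redundant", "insert": "missing"}
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            edits.append((labels[tag], source[i1:i2], target[j1:j2]))
    return edits


# Hypothetical examples (not from the paper's data).
print(classify_edits("我喜欢吃平果", "我喜欢吃苹果"))   # [('substitution', '平', '苹')]
print(classify_edits("我非常很喜欢他", "我非常喜欢他"))  # [('redundant', '很', '')]
print(classify_edits("我明天北京", "我明天去北京"))     # [('missing', '', '去')]
```

Only the first case keeps the sentence length unchanged, which is why a character-by-character substitution model is enough for spelling correction, whereas redundant and missing characters call for a model that can change the output length, such as Seq2Seq.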

