In recent years, deep learning has been widely applied and developed. In natural language processing, pre-trained models have seen increasingly broad use: whether for sentence extraction or text sentiment analysis, they play a very important role. Unsupervised pre-training on a large-scale corpus has proven to be an effective way to equip models with general language knowledge. This article surveys existing pre-trained models, reviews the improvements and processing methods of the more recent ones, and concludes by summarizing the challenges and prospects facing current pre-trained models.