
Research on Text Classification Methods by Fusing DeBERTa Model with Graph Convolutional Networks

DOI: 10.12677/airr.2024.134072, PP. 715-725

Keywords: Text Classification, Pre-Trained Models, Graph Neural Networks


Abstract:

Text classification, a core task in natural language processing, aims to automatically assign text data to predefined categories. The BertGCN model combines the strengths of BERT and GCN, enabling it to handle both text and graph-structured data effectively. However, the model still has limitations on complex text classification tasks. BERT uses absolute position encoding to represent each word's position in a sequence, which cannot adequately capture the relative relationships between words in a sentence; moreover, because BERT processes a word's content information and position information jointly, the model may struggle to distinguish these two kinds of information. To overcome these limitations, we propose the DeGraph-Net model, which improves text classification by incorporating the DeBERTa model. DeBERTa uses relative position encoding, which better represents the relative positional relationships between words, and it processes content information and position information separately, avoiding confusion between the two and improving classification accuracy. Experimental results show that DeGraph-Net achieves significant performance gains on three benchmark text classification datasets, validating its effectiveness on complex text classification tasks.
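
To make the content/position separation concrete: in DeBERTa's disentangled attention (He et al., 2020), each token carries a content vector and a relative-position vector, and the attention score between tokens i and j decomposes into content-to-content, content-to-position, and position-to-content terms. A minimal sketch of that decomposition, with notation following the DeBERTa paper (the full model also rescales the softmax by a factor of \sqrt{3d}):

\tilde{A}_{i,j} = Q_i^{c} \, {K_j^{c}}^{\top} + Q_i^{c} \, {K_{\delta(i,j)}^{r}}^{\top} + K_j^{c} \, {Q_{\delta(j,i)}^{r}}^{\top}

Here Q^{c}, K^{c} are projections of the content embeddings, Q^{r}, K^{r} are projections of shared relative-position embeddings, and \delta(i,j) is the clipped relative distance from token i to token j. Because position enters the score only through the two cross terms, the model can weigh it independently of content, which is the separation the abstract refers to.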
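Since DeGraph-Net builds on BertGCN, it also helps to recall how that baseline fuses its two branches (Lin et al., 2021): a GCN runs over a word-document graph with the encoder's document embeddings as node features, and the final prediction linearly interpolates the two softmax outputs. A sketch, under the assumption that DeGraph-Net keeps this interpolation with DeBERTa in place of BERT:

Z = \lambda \, Z_{\mathrm{GCN}} + (1 - \lambda) \, Z_{\mathrm{BERT}}, \qquad \lambda \in [0, 1]

where each GCN layer uses the standard propagation rule of Kipf and Welling (2016),

H^{(l+1)} = \sigma\!\left( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)} \right), \qquad \tilde{A} = A + I,

with \tilde{D} the degree matrix of \tilde{A}. Setting \lambda = 0 recovers the pure encoder classifier and \lambda = 1 the pure GCN, so \lambda controls how much of the final decision comes from graph structure.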

