Journal of Intelligent Learning Systems and Applications 2018

A Sentence Similarity Estimation Method Based on Improved Siamese Network

DOI: 10.4236/jilsa.2018.104008, pp. 121-134

Ziming Chi, Bingyan Zhang

Keywords: Sentence Similarity, Sentence Modeling, Similarity Measurement, Attention Mechanism, Fully-Connected Layer, Disorder Sentence Dataset


Abstract:

In this paper, we employ an improved Siamese neural network to assess the semantic similarity between sentences. The model takes two sentences as input and outputs a similarity score. It is built on a Siamese architecture with deep Long Short-Term Memory (LSTM) networks. We add an attention mechanism so that the model weights words differently when modeling a sentence, and we use a fully-connected layer to compare the resulting complex sentence representations. Our results show that the accuracy exceeds that of the 2016 baseline. Furthermore, the model is shown to capture word order, distribute attention reasonably, and extract the meaning of a sentence along different dimensions.
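The pipeline described in the abstract (a shared LSTM encoder, attention over the word states, and a fully-connected scoring layer) can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, the random untrained weights, the dot-product attention form, and the `|a - b|` and `a * b` comparison features are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random

random.seed(0)
EMB, HID = 4, 3  # illustrative sizes, not taken from the paper


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# One set of parameters shared by both sentences: this sharing is the "Siamese" part.
VOCAB = {}  # word -> random embedding, created on first use
W_GATES = {g: rand_matrix(HID, EMB + HID) for g in "ifoc"}  # LSTM gate weights
ATT_V = [random.uniform(-0.5, 0.5) for _ in range(HID)]     # attention vector (assumed form)
FC_W = [random.uniform(-0.5, 0.5) for _ in range(2 * HID)]  # fully-connected scoring weights


def embed(word):
    if word not in VOCAB:
        VOCAB[word] = [random.uniform(-1.0, 1.0) for _ in range(EMB)]
    return VOCAB[word]


def lstm_states(words):
    """Run a single-layer LSTM (biases omitted) and return one hidden state per word."""
    h, c = [0.0] * HID, [0.0] * HID
    states = []
    for word in words:
        x = embed(word) + h  # concatenate input embedding and previous hidden state
        i = [sigmoid(z) for z in matvec(W_GATES["i"], x)]    # input gate
        f = [sigmoid(z) for z in matvec(W_GATES["f"], x)]    # forget gate
        o = [sigmoid(z) for z in matvec(W_GATES["o"], x)]    # output gate
        g = [math.tanh(z) for z in matvec(W_GATES["c"], x)]  # candidate cell values
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        states.append(h)
    return states


def attend(states):
    """Softmax attention over word states; returns (pooled vector, weights)."""
    scores = [sum(v * s for v, s in zip(ATT_V, h)) for h in states]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * h[k] for w, h in zip(weights, states)) for k in range(HID)]
    return pooled, weights


def similarity(sent_a, sent_b):
    """Score two sentences in (0, 1) using the shared encoder and an FC layer."""
    rep_a, _ = attend(lstm_states(sent_a.split()))
    rep_b, _ = attend(lstm_states(sent_b.split()))
    # Assumed comparison features: element-wise absolute difference and product.
    features = [abs(a - b) for a, b in zip(rep_a, rep_b)]
    features += [a * b for a, b in zip(rep_a, rep_b)]
    return sigmoid(sum(w * f for w, f in zip(FC_W, features)))


score = similarity("a man is playing a guitar", "a person plays the guitar")
```

Because the weights here are random rather than trained, `score` carries no semantic meaning; the sketch only shows how the pieces connect. In a trained model the parameters would be fit to labeled sentence pairs, for example with the Adam optimizer.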
