- 2018
Intelligent passage ranking based on lexical and semantic relevance matching
Abstract:
A deep neural network model based on lexical relevance matching and semantic relevance matching is proposed to compute the matching score between a query and a document in information retrieval. The lexical relevance matching model operates on the word co-occurrence matrix between the query and the document, and mainly captures exact term matches together with the positions of the matched terms. The semantic relevance matching model starts from pre-trained word vectors and uses a convolutional neural network to extract semantic matching signals between the query and different positions of the document. The final matching score is the sum of the scores of the two sub-models. The loss function is the hinge loss, and parameters are updated by maximizing the score margin between positive and negative samples. Experimental results show that the model reaches an NDCG@3 of 0.7904 and an NDCG@5 of 0.8183 on the validation set, a large improvement over BM25 and over either the lexical matching model or the semantic matching model alone, which supports the importance of both lexical and semantic matching for information retrieval.
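The training objective and the evaluation metric named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the margin of 1.0, and the exponential gain (2^rel - 1) in the DCG are assumptions chosen for illustration, since the abstract does not specify them. The two sub-model scores are taken as plain numbers here, standing in for the outputs of the lexical and semantic networks.

```python
import math
from typing import Sequence


def combined_score(lexical: float, semantic: float) -> float:
    """Final matching score as the sum of the two sub-model scores."""
    return lexical + semantic


def pairwise_hinge_loss(pos_score: float, neg_score: float,
                        margin: float = 1.0) -> float:
    """Pairwise hinge loss (margin value is an assumption).

    The loss is zero once the positive document outscores the
    negative document by at least `margin`.
    """
    return max(0.0, margin - pos_score + neg_score)


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k items.

    Uses the 2^rel - 1 gain and a log2(rank + 1) discount with
    1-based ranks (assumed variant; the abstract does not say).
    """
    return sum((2.0 ** rel - 1.0) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG@k divided by the DCG@k of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical scores for one relevant and one irrelevant document.
pos = combined_score(lexical=2.0, semantic=1.0)
neg = combined_score(lexical=1.5, semantic=1.0)
loss = pairwise_hinge_loss(pos, neg)

# Graded relevance labels listed in the order the model ranked them.
ranked_labels = [0, 2, 3]
ndcg3 = ndcg_at_k(ranked_labels, k=3)
```

Minimizing this loss over (query, positive document, negative document) triples pushes the score gap between relevant and irrelevant documents toward the margin, which is the behavior the abstract describes as maximizing the score difference between positive and negative samples.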