%0 Journal Article
%T Paraphrase Identification Based on Sentence Semantic Distance
%A 黄江平
%A 姬东鸿
%J 工程科学与技术
%D 2016
%X Abstract: To address how paraphrase identification can learn contextual semantic information, a model that represents the semantic distance between sentences using word embeddings is proposed. First, large-scale word vectors are trained with word2vec, so that the semantic information of each word is captured in a distributed vector representation. Then, the travel cost between the words of two sentences is computed as the Euclidean distance in the word2vec embedding space. Finally, a model that goes from word-level semantic distances to sentence-level semantic distances is built on the Earth Mover's Distance (EMD): a sentence transportation matrix provides the distance metric between sentences, and the resulting semantic distance is used to identify paraphrases by semantic similarity. Experiments on the SemEval-2015 PIT task compare the model with two baselines, logistic regression and weighted matrix factorization. In the supervised setting the proposed model performs close to the baseline; in the unsupervised setting it improves on weighted matrix factorization by 5.8%.
%K paraphrase identification; word vector; sentence semantic distance; Twitter
%U http://jsuese.ijournals.cn/jsuese_cn/ch/reader/view_abstract.aspx?file_no=201600219&flag=1
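
Note: the pipeline described in the abstract (word vectors, Euclidean travel cost between words, EMD lifted to a sentence distance through a transportation matrix) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each sentence is a bag of words with normalized term-frequency weights, uses hypothetical 2-D toy vectors in place of trained word2vec embeddings, and solves the transportation problem with a small pure-Python min-cost-flow routine. The names `euclidean`, `min_cost_flow`, `sentence_distance`, and `toy_vectors` are illustrative only.

```python
import math
from collections import Counter


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (the word travel cost)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def min_cost_flow(n_nodes, edges, source, sink):
    """Successive shortest paths with Bellman-Ford on the residual graph.

    edges: list of (u, v, capacity, cost) with integer capacities.
    Returns (total_cost, flow_on_each_input_edge).
    """
    # Residual edge 2k is the k-th input edge, 2k + 1 is its reverse.
    to, cap, cost = [], [], []
    adj = [[] for _ in range(n_nodes)]
    for u, v, c, w in edges:
        adj[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        adj[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)
    total = 0.0
    while True:
        dist = [math.inf] * n_nodes
        prev_edge = [-1] * n_nodes
        dist[source] = 0.0
        for _ in range(n_nodes - 1):
            changed = False
            for u in range(n_nodes):
                if dist[u] == math.inf:
                    continue
                for e in adj[u]:
                    # Small tolerance guards against floating-point noise.
                    if cap[e] > 0 and dist[u] + cost[e] < dist[to[e]] - 1e-12:
                        dist[to[e]] = dist[u] + cost[e]
                        prev_edge[to[e]] = e
                        changed = True
            if not changed:
                break
        if dist[sink] == math.inf:
            break  # no augmenting path left: all mass has been moved
        # Find the bottleneck capacity along the shortest path, then push it.
        push, v = math.inf, sink
        while v != source:
            e = prev_edge[v]
            push = min(push, cap[e])
            v = to[e ^ 1]
        v = sink
        while v != source:
            e = prev_edge[v]
            cap[e] -= push
            cap[e ^ 1] += push
            v = to[e ^ 1]
        total += push * dist[sink]
    return total, [cap[2 * k + 1] for k in range(len(edges))]


def sentence_distance(tokens_a, tokens_b, vectors):
    """EMD between two sentences; returns (distance, transportation matrix as a dict)."""
    if not tokens_a or not tokens_b:
        raise ValueError("both sentences must contain at least one token")
    count_a, count_b = Counter(tokens_a), Counter(tokens_b)
    words_a, words_b = list(count_a), list(count_b)
    # Scale the 1/len weights to integers so the flow solver works on whole units.
    scale = len(tokens_a) * len(tokens_b)
    source, sink = 0, 1 + len(words_a) + len(words_b)
    edges = [(source, 1 + i, count_a[w] * len(tokens_b), 0.0)
             for i, w in enumerate(words_a)]
    edges += [(1 + len(words_a) + j, sink, count_b[w] * len(tokens_a), 0.0)
              for j, w in enumerate(words_b)]
    offset, pairs = len(edges), []
    for i, wa in enumerate(words_a):
        for j, wb in enumerate(words_b):
            travel_cost = euclidean(vectors[wa], vectors[wb])
            edges.append((1 + i, 1 + len(words_a) + j, scale, travel_cost))
            pairs.append((wa, wb))
    total, flows = min_cost_flow(sink + 1, edges, source, sink)
    transport = {pair: flows[offset + k] / scale
                 for k, pair in enumerate(pairs) if flows[offset + k] > 0}
    return total / scale, transport


# Hypothetical 2-D embeddings standing in for trained word2vec vectors.
toy_vectors = {
    "cat": (0.0, 0.0), "sits": (1.0, 0.0),
    "kitten": (0.0, 1.0), "rests": (1.0, 1.0),
}

distance, transport = sentence_distance(["cat", "sits"], ["kitten", "rests"], toy_vectors)
print(distance, transport)
```

A smaller distance means the two sentences are semantically closer. How that distance is turned into a paraphrase decision (for example a threshold, or a feature for a supervised classifier) is not specified in the abstract and is left out of the sketch.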