%0 Journal Article %T An ontology-based readability model for vertical search (面向垂直搜索基于本体的可读性计算模型) %A ZHANG Wen-ya (张文雅) %A SONG Da-wei (宋大为) %A ZHANG Peng (张鹏) %J 山东大学学报(理学版) %D 2016 %R 10.6040/j.issn.1671-9352.1.2015.069 %X Abstract: As an emerging evaluation criterion for information retrieval (IR), readability plays an important role in assessing document relevance, utility, and quality. How to provide users with documents that are both relevant and readable has become an urgent problem in vertical search. To address this problem, we propose an ontology-based readability model. Taking the abstraction process of a user's reading as its background, the model measures document readability at the surface (discourse) level and at the conceptual level, introducing three readability indicators: Concept Topography, Concept Scope, and Document Coherence. Specifically, the readability score computed from an individual indicator or from a combination of indicators is incorporated into a conventional vertical retrieval model to re-rank the initially retrieved documents. In the medical domain, user-oriented evaluation shows that readability indicators based on ontology concept sequence information predict the true readability level of documents more effectively than traditional non-sequential indicators, and that the model correlates well with human judgments. System-oriented evaluation further shows that the readability-based re-ranking model accounts for both document relevance and readability, improves retrieval performance in the vertical domain, and is competitive with one of the state-of-the-art readability models. %K domain-specific information retrieval %K document re-ranking %K readability %K vertical search %U http://lxbwk.njournal.sdu.edu.cn/CN/10.6040/j.issn.1671-9352.1.2015.069