%0 Journal Article
%T Microblog Timeline Summarization Algorithm Based on Sliding Window
%A 徐伟
%A 赵斌
%A 吉根林
%J 数据采集与处理
%D 2017
%R 10.16337/j.1004-9037.2017.03.011
%X Timeline summarization condenses and summarizes text content along the time dimension. Traditional timeline summarization mainly targets long texts such as news, whereas this paper studies timeline summarization of short microblog texts. Because short microblog texts offer limited content features, a summary cannot be generated from text content alone. This paper therefore evaluates timeline summaries with three indicators (content coverage, temporal distribution, and propagation influence) and proposes a microblog timeline summarization algorithm based on a sliding window (MTSW). The algorithm first uses term strength and entropy to select representative terms. It then builds a comprehensive evaluation indicator for timeline summaries from the three indicators. Finally, it slides a window over the sequence of microblog messages on the time axis to generate the microblog timeline summary. Experiments on real microblog datasets show that the timeline summaries generated by MTSW effectively reflect how hot events develop and evolve.
%K microblog summary
%K timeline summary
%K short text summary
%K event evolution
%U http://sjcj.nuaa.edu.cn/ch/reader/view_abstract.aspx?file_no=201703011&flag=1