%0 Journal Article
%T Hot Word Extraction for Microblog Based on Massive Data Filtering
%T 基于海量信息过滤的微博热词抽取方法
%A WANG Yang
%A SHUAI Jian-Mei
%A CHEN Zhi-Gang
%A 汪洋
%A 帅建梅
%A 陈志刚
%J Computer Systems & Applications (计算机系统应用)
%D 2012
%X This paper presents a hot word extraction algorithm for Chinese microblogs based on massive data filtering. First, it selects user behavior characteristics and text characteristics to build user behavior models, and filters the massive data with a fast rule-based algorithm to build topic trees. Then, it applies a hot word extraction algorithm that uses word frequency features to obtain the hot topics of the topic trees. The experimental results show that the proposed algorithm reduces the scale of the input data while retaining much of the important information needed for hot word extraction.
%K Chinese microblog
%K user behavior models
%K massive data filtering
%K hot word extraction
%K power law distribution
%K 中文微博
%K 用户行为模型
%K 海量信息过滤
%K 热词抽取
%K 幂律分布
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=D4F6864C950C88FFCE5B6C948A639E39&aid=2F008F04A30914BA30AB23AD406F8BE6&yid=99E9153A83D4CB11&vid=659D3B06EBF534A7&iid=708DD6B15D2464E8&sid=6DE26652A1045643&eid=58F693790F887B3B&journal_id=1003-3254&journal_name=计算机系统应用&referenced_num=0&reference_num=8