%0 Journal Article
%T Multi-view canonical correlation analysis based Web spam detection
基于多视图典型相关分析的垃圾网页检测
%A GAO Shuang
%A ZHANG Hua-xiang
%A FANG Xiao-nan
%A 高爽
%A 张化祥
%A 房晓南
%J Application Research of Computers (计算机应用研究)
%D 2013
%X This paper first divides the features of Web spam pages into two views: a content-feature view and a link-feature view. It then applies canonical correlation analysis and its extensions for feature extraction, generating two new feature sets for each Web page. Different combinations of these two feature sets are used to produce a single view of each Web page, on which classifiers are built. Experimental results show that treating Web page data as two-view data and applying multi-view canonical correlation analysis can effectively improve the accuracy of Web spam detection.
%K Web spam detection
%K canonical correlation analysis
%K multi-view classification
%K feature extraction
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=A9D9BE08CDC44144BE8B5685705D3AED&aid=57319E4289F4438F10B9703366AAC253&yid=FF7AA908D58E97FA&vid=340AC2BF8E7AB4FD&iid=38B194292C032A66&sid=4A2C67480A6B9F95&eid=FCB110411B6339D8&journal_id=1001-3695&journal_name=计算机应用研究&referenced_num=0&reference_num=11