%0 Journal Article
%T How to Get the Main Part of Web Pages (original title: A Tag-Tree-Based Method for Web Page Region Division and Search)
%A HU Fei
%J Computer Science (计算机科学)
%D 2005
%I
%X A Web page can be divided into several parts: the main part, the department logo, the navigation bar, the hyperlinks, and the copyright notice. Extracting the main part of a Web page is easy for humans but hard for computer processing. In this paper we tackle the problem by exploring a tag tree, which can suitably express the structure and layout of Web pages. We propose a method to build the tag tree and, in addition, develop a single-path tag tree, named the tag tree model, which describes only the main part of Web pages.
%K Web page layout
%K Web page structure
%K Web page area
%K Tag tree
%K Tag tree model
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=64A12D73428C8B8DBFB978D04DFEB3C1&aid=0EA1A5BC316E6341&yid=2DD7160C83D0ACED&vid=9971A5E270697F23&iid=5D311CA918CA9A03&sid=B1F98368A47B8888&eid=D46BA3D3D4B3C585&journal_id=1002-137X&journal_name=计算机科学&referenced_num=4&reference_num=7