China Internet Network Information Center. The 32nd Statistical Report on Internet Development in China[EB/OL]. [2013-07-17]. http://www.cnnic.net.cn/hlwfzyj/hlwxzbg/hlwtjbg/201307/t20130717_40664.htm (in Chinese)
[2]
Pretzsch S, Muthmann K, Schill A. FODEX-Towards Generic Data Extraction from Web Forums // Proc of the 26th International Conference on Advanced Information Networking and Applications. Fukuoka, Japan, 2012: 821-826
[3]
Liu W, Yan H L, Xiao J G. Automatically Extracting User Reviews from Forum Sites. Computers and Mathematics with Applications, 2011, 62(7): 2779-2792
[4]
Liu J, Song X Y, Jiang J T, et al. An Unsupervised Method for Author Extraction from Web Pages Containing User-Generated Content // Proc of the 21st ACM International Conference on Information and Knowledge Management. Maui, USA, 2012: 2387-2390
[5]
Song X Y, Liu J, Cao Y B, et al. Automatic Extraction of Web Data Records Containing User-Generated Content // Proc of the 19th ACM International Conference on Information and Knowledge Management. Toronto, Canada, 2010: 39-48
[6]
Yang J M, Cai R, Wang Y D, et al. Incorporating Site-Level Knowledge to Extract Structured Data from Web Forums // Proc of the 18th International Conference on World Wide Web. Madrid, Spain, 2009: 181-190
[7]
Van der Meer J, Frasincar F. Automatic Review Identification on the Web Using Pattern Recognition. Software: Practice and Experience, 2013, 43(12): 1415-1436
[8]
Yin X X, Tan W Z, Li X, et al. Automatic Extraction of Clickable Structured Web Contents for Name Entity Queries // Proc of the 19th International Conference on World Wide Web. Raleigh, USA, 2010: 991-1000
[9]
Hong J L, Tan E X, Fauzi F. Data Extraction for Search Engine Using Safe Matching // Proc of the 24th Australasian Joint Conference on Artificial Intelligence. Perth, Australia, 2011: 759-768
[10]
Zhao H K, Meng W Y, Wu Z H, et al. Fully Automatic Wrapper Generation for Search Engines // Proc of the 14th International Conference on World Wide Web. Chiba, Japan, 2005: 66-75
[11]
Hong J L, Siew E G, Egerton S. WMS-Extracting Multiple Sections Data Records from Search Engine Results Pages // Proc of the ACM Symposium on Applied Computing. Sierre, Switzerland, 2010: 1696-1701
[12]
Liu B, Grossman R, Zhai Y H. Mining Data Records in Web Pages // Proc of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Washington, USA, 2003: 601-606
[13]
Miao G X, Tatemura J C, Hsiung W P, et al. Extracting Data Records from the Web Using Tag Path Clustering // Proc of the 18th International Conference on World Wide Web. Madrid, Spain, 2009: 981-990
[14]
Wang Y, Li B C, Lin C. Data Extraction from Web Forums Based on Similarity of Page Layout. Journal of Chinese Information Processing, 2010, 24(2): 68-75 (in Chinese)
[15]
Yamada Y, Craswell N, Nakatoh T, et al. Testbed for Information Extraction from Deep Web // Proc of the 13th International Conference on World Wide Web. New York, USA, 2004: 346-347