- 2013
Research on a Web Page Extraction Algorithm Based on Combined Verification (基于组合验证的Web页面抽取算法研究)
Abstract:
By examining the nature of Web page extraction algorithms and the relationships among them, this paper analyzes how such algorithms complement one another and proposes a combined verification mechanism that uses multiple algorithms. The mechanism detects errors made by an extraction algorithm and, combined with dynamic threshold adjustment, improves extraction accuracy.
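The abstract gives no implementation details, so the following is only a minimal sketch of what a combined verification loop with dynamic threshold adjustment might look like, not the paper's actual method. Everything in it is an assumption made for illustration: the names `combined_extract`, `similarity`, and `by_min_length`, the use of string similarity as the agreement test, the `agree` cutoff, and the rule of lowering the threshold of the least-agreeing extractor by a fixed `step`.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Optional, Tuple

# Hypothetical interface: an extractor maps (page_text, threshold) to extracted text.
Extractor = Callable[[str, float], str]


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two extracted strings."""
    return SequenceMatcher(None, a, b).ratio()


def by_min_length(page: str, threshold: float) -> str:
    """Toy stand-in extractor: keep lines with at least threshold * 100 characters."""
    return "\n".join(line for line in page.splitlines() if len(line) >= threshold * 100)


def combined_extract(
    page: str,
    extractors: List[Extractor],
    thresholds: List[float],
    agree: float = 0.8,
    step: float = 0.1,
    max_rounds: int = 5,
) -> Tuple[Optional[str], List[float]]:
    """Run at least two extractors, verify them against each other, and retry.

    Returns (text, thresholds) when all outputs agree, or (None, thresholds)
    when no agreement is reached within max_rounds (a detected failure).
    """
    thresholds = list(thresholds)  # work on a copy; do not mutate the caller's list
    for _ in range(max_rounds):
        outputs = [ex(page, t) for ex, t in zip(extractors, thresholds)]
        # Score each output by its mean similarity to every other output.
        scores = []
        for i, out in enumerate(outputs):
            others = [similarity(out, o) for j, o in enumerate(outputs) if j != i]
            scores.append(sum(others) / len(others))
        if min(scores) >= agree:
            return outputs[0], thresholds
        # Assumed adjustment rule: relax the threshold of the least-agreeing extractor.
        worst = scores.index(min(scores))
        thresholds[worst] = max(0.0, thresholds[worst] - step)
    return None, thresholds


if __name__ == "__main__":
    page = "nav\nThis is the main article text of the page.\nfooter"
    text, final = combined_extract(page, [by_min_length] * 3, [0.2, 0.2, 0.5])
    print(text, final)
```

In this sketch a disagreement among extractors is treated as the signal that at least one of them has failed, which is one possible reading of the error detection described in the abstract. A real system would use structurally different extraction algorithms rather than three copies of the same toy function.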