计算机科学 (Computer Science), 2010
Improvement of Web Information Extraction Algorithm Based on
Abstract:
With the development of Internet technologies, the amount of information on the Internet is increasing exponentially. One important line of research focuses on how to extract structured data from this large volume of unstructured online text documents. This thesis mainly studies algorithms for Web information extraction based on the hidden Markov model (HMM), discusses how to use an HMM and how to label data in text information extraction, and offers several methods to improve the hidden Markov model in information extracti...
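To illustrate how an HMM can be used to label tokens in text information extraction, the following is a minimal sketch of standard Viterbi decoding. It is not the improved algorithm proposed in the thesis, which this abstract does not specify. The state names (`LABEL`, `VALUE`), the toy probabilities, and the `unk` smoothing constant are all hypothetical values chosen for illustration only.

```python
import math

def viterbi(tokens, states, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most likely state sequence for tokens under a toy HMM.

    unk is an assumed floor probability for tokens a state never emitted.
    """
    if not tokens:
        return []

    def log_emit(state, token):
        return math.log(emit_p[state].get(token, unk))

    # Log-probabilities avoid numeric underflow on long sequences.
    scores = [{s: math.log(start_p[s]) + log_emit(s, tokens[0]) for s in states}]
    backpointers = []
    for token in tokens[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for s in states:
            # Best predecessor state for arriving in s at this position.
            best = max(states, key=lambda p: prev[p] + math.log(trans_p[p][s]))
            cur[s] = prev[best] + math.log(trans_p[best][s]) + log_emit(s, token)
            ptr[s] = best
        scores.append(cur)
        backpointers.append(ptr)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return path[::-1]

# Hypothetical toy model: field labels alternate with field values.
states = ["LABEL", "VALUE"]
start_p = {"LABEL": 0.9, "VALUE": 0.1}
trans_p = {
    "LABEL": {"LABEL": 0.2, "VALUE": 0.8},
    "VALUE": {"LABEL": 0.6, "VALUE": 0.4},
}
emit_p = {
    "LABEL": {"price": 0.5, "author": 0.5},
    "VALUE": {"25": 0.5, "smith": 0.5},
}

tags = viterbi(["price", "25", "author", "smith"], states, start_p, trans_p, emit_p)
print(tags)
```

In a real extraction task the start, transition, and emission probabilities would be estimated from labeled training pages rather than written by hand, and the states would correspond to the fields to be extracted.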