|
Deep Webpage Classification and Extraction (DWCE). Keywords: Deep Web, search query interface, query processors, crawler. Abstract: Information on the Deep Web (or Hidden Web) sits behind search query forms and can be accessed only by interacting with those forms. An automated system that interacts with search forms and extracts the hidden web pages would therefore be of great value to human users. To accomplish this task, this paper proposes a novel method, “Deep Webpage Classification and Extraction” (DWCE), which classifies websites into appropriate domains, extracts their query interfaces, and retrieves all result pages of deep websites using a query building system.
|
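To make the query-interface extraction step concrete, the following is a minimal sketch of how search forms might be pulled out of a page's HTML. It is not the method proposed in this paper: the names `FormExtractor` and `extract_query_interfaces` and the sample page are hypothetical, and the sketch assumes a query interface can be approximated as a `<form>` element together with its named input fields. It uses only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class FormExtractor(HTMLParser):
    """Collects each <form> with its action, method, and named fields.

    Hypothetical helper for illustration; not part of DWCE itself.
    """

    def __init__(self):
        super().__init__()
        self.forms = []       # forms that have been fully parsed
        self._current = None  # form currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
        elif self._current is not None and tag in ("input", "select", "textarea"):
            name = attrs.get("name")
            if name:  # unnamed controls (e.g. a bare submit button) are skipped
                kind = (attrs.get("type") or "text") if tag == "input" else tag
                self._current["fields"].append((name, kind))

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def extract_query_interfaces(html):
    """Return a list of dicts, one per <form> found in the HTML string."""
    parser = FormExtractor()
    parser.feed(html)
    return parser.forms


# Assumed sample page with a single search form.
SAMPLE_PAGE = """
<form action="/search" method="GET">
  <input type="text" name="title">
  <select name="genre"><option>Any</option></select>
  <input type="submit" value="Go">
</form>
"""

if __name__ == "__main__":
    for form in extract_query_interfaces(SAMPLE_PAGE):
        print(form["method"], form["action"], form["fields"])
```

A query building system of the kind described in the abstract could then fill the extracted field names with candidate values and submit them to the form's action URL; that step, like the domain classification, is outside the scope of this sketch.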