A Two-stage Crawler for Efficiently Harvesting Deep-Web Interfaces

Publication Date : 31/03/2016

Volume/Issue : Volume 2, Issue 3 (03 - 2016)

As the deep web grows at a very fast pace, there has been increased interest in techniques that help efficiently locate deep-web interfaces. However, due to the large volume of web resources and the dynamic nature of the deep web, achieving wide coverage and high efficiency is a challenging issue. The existing system proposes a two-stage framework, namely Smart Crawler, for efficiently harvesting deep-web interfaces. To achieve more accurate results for a focused crawl, Smart Crawler ranks websites to prioritize highly relevant ones for a given topic. The proposed system uses a multi-keyword search concept so that the system returns all possible relevant links. This will be achieved in two ways. First, the query submitted to the application is preprocessed; after preprocessing, only the root words are kept, and their synonyms, hypernyms, and hyponyms are found and listed to the user, which is why all possible links related to the search can be found. If any word in the displayed list is selected, all the related website links, images, and news feeds are given to the user as the final output. Then, a bookmark concept is included: a bookmarked link is added directly to the application rather than to the browser, so the bookmarked content is visible globally.
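The query preprocessing and expansion step described above might be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation used in the proposed system: the stop-word list, the suffix-stripping rule in `root_word`, and the tiny in-memory `LEXICON` are all hypothetical stand-ins for a real stemmer and a full lexical database.

```python
import re

# Hypothetical stop-word list (assumption, not from the paper).
STOP_WORDS = {"a", "an", "the", "of", "for", "in", "on", "and", "to"}

# Hypothetical toy lexicon standing in for a real lexical database.
LEXICON = {
    "car": {
        "synonyms": ["automobile"],
        "hypernyms": ["vehicle"],
        "hyponyms": ["taxi", "sedan"],
    },
    "price": {
        "synonyms": ["cost"],
        "hypernyms": ["value"],
        "hyponyms": ["fare"],
    },
}


def root_word(token):
    """Naive suffix stripping, a stand-in for a real stemmer or lemmatizer."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(query):
    """Lowercase the query, drop stop words, and reduce each word to its root."""
    tokens = re.findall(r"[a-z]+", query.lower())
    return [root_word(t) for t in tokens if t not in STOP_WORDS]


def expand_query(query):
    """Map each root word to the related terms that would be listed to the user."""
    expansions = {}
    for root in preprocess(query):
        entry = LEXICON.get(root, {})
        expansions[root] = {
            "synonyms": entry.get("synonyms", []),
            "hypernyms": entry.get("hypernyms", []),
            "hyponyms": entry.get("hyponyms", []),
        }
    return expansions


if __name__ == "__main__":
    for root, related in expand_query("The prices of cars").items():
        print(root, related)
```

In this sketch, the user would pick a term from the displayed expansions, and that term would then drive the search for website links, images, and news feeds. How the selected term is submitted to the crawler is not specified in the abstract and is left out here.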

