Description: More and more people are interested in writing web crawlers (spiders), and more and more applications need them: search engines, information gathering, public opinion monitoring, and so on. Crawler technology spans a wide and complex set of algorithms and strategies, including web access, web tracking, web analysis, web search, page ranking, and structured/unstructured data extraction, as well as finer-grained data mining further downstream. A novice cannot master and apply all of this overnight, so this project focuses on one of these areas, covered in six ways.
File list (Check if you may need any files):
CrawlerTest\src\cn\ysh\studio\crawler\htmlunit\HtmlUnitSpider.java
CrawlerTest\src\cn\ysh\studio\crawler\httpclient\HttpClientTest.java
CrawlerTest\src\cn\ysh\studio\crawler\ie\WatijTest.java
CrawlerTest\src\cn\ysh\studio\crawler\jsoup\JsoupTest.java
CrawlerTest\src\cn\ysh\studio\crawler\selenium\BaseTest.java
CrawlerTest\src\cn\ysh\studio\crawler\selenium\HtmlDriverTest.java
CrawlerTest\src\cn\ysh\studio\crawler\webspec\WebspecTest.java