Description: Written in Java. It was left over from an experiment; I was going to delete it, but I am uploading it instead to share with everyone. Platform: |
Size: 19097600 |
Author:Elaine |
Hits:
Description: A search engine built with Lucene 2.0 and Heritrix, implemented in Eclipse. Platform: |
Size: 5620736 |
Author:nick |
Hits:
Description: Heritrix is an open-source web crawler (web spider). Its purpose is to follow the URLs found on pages to extend the crawl, ultimately providing a broad data source for search engines. Platform: |
Size: 9784320 |
Author:傅志诚 |
Hits:
Description: A web crawler. Users can use it to fetch the resources they want from the web, and developers can extend its individual components to implement their own crawl logic. Platform: |
Size: 19386368 |
Author:echoli |
Hits:
Description: heritrix-1.14.2-src is the source code of the latest version of the Heritrix web crawler. I hope it is helpful to everyone. Platform: |
Size: 10543104 |
Author: |
Hits:
Description: A high-performance word segmentation algorithm implemented in Java. It automatically performs minimal (finest-grained) segmentation, and users can filter by segmentation category. Platform: |
Size: 10551296 |
Author:lijianfei |
Hits:
Description: This is a good web crawler, well suited to general-purpose search engines! Platform: |
Size: 10612736 |
Author:dudu |
Hits: