Description: Written in Java. This was left over from an experiment; I was going to delete it, but I am uploading it instead so everyone can share it. Platform: |
Size: 19097600 |
Author:Elaine |
Hits:
Description: Heritrix is an open-source web crawler (web spider). Its purpose is to follow the URLs found on pages to extend the crawl, ultimately providing a broad source of data for search engines. Platform: |
Size: 9784320 |
Author:傅志诚 |
Hits:
Description: A high-performance word segmentation algorithm implemented in Java. It automatically performs minimal segmentation, and users can filter by segmentation category. Platform: |
Size: 10551296 |
Author:lijianfei |
Hits: