Description: Chinese sentence segmentation using the maximum matching method. Maximum matching is the most commonly used word segmentation algorithm; it is simple and practical, and its accuracy can exceed 80%. Platform: |
Size: 73728 |
Author:廖剑 |
Hits:
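For readers unfamiliar with the maximum matching method named in the entry above, here is a minimal Python sketch of the forward variant. It is not taken from the listed package: the function name `forward_max_match`, the `max_len` limit, and the toy dictionary are all assumptions made for illustration.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy left-to-right segmentation: take the longest dictionary word at each position."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary word is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


# Toy dictionary (hypothetical; a real system would load a large word list).
vocab = {"研究", "研究生", "生命", "起源"}
result = forward_max_match("研究生命起源", vocab)
print(result)  # ['研究生', '命', '起源']
```

The output shows the method's typical weakness: the greedy scan commits to 研究生 and leaves 命 stranded, which is one reason the accuracy of plain maximum matching has a ceiling.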
Description: Statistics-based word segmentation for Chinese. Point it directly at a directory to use it; the package includes a test run against Lianhe Zaobao. The segmentation statistics can cover any directory (as long as the system can handle the load). Written as homework to help a classmate :) using ASP.NET + XML. Platform: |
Size: 43008 |
Author: |
Hits:
Description: A Chinese word segmentation program based on dynamic programming, written in VC and easy to extend. Platform: |
Size: 59392 |
Author:何伟 |
Hits:
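The entry above does not say which dynamic programming formulation the program uses. As one possible reading, the Python sketch below picks the segmentation with the lowest total cost, where a dictionary word costs 1 and an out-of-dictionary single character costs more. The name `dp_segment`, the cost values, and the toy dictionary are assumptions for illustration only.

```python
def dp_segment(text, vocab, max_len=4, unknown_cost=2):
    """Minimum-cost segmentation by dynamic programming over prefix lengths."""
    n = len(text)
    best = [0] + [float("inf")] * n  # best[j]: lowest cost of segmenting text[:j]
    back = [0] * (n + 1)             # back[j]: start index of the last word in that segmentation
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            piece = text[i:j]
            if piece in vocab:
                cost = 1
            elif j - i == 1:
                cost = unknown_cost  # penalize characters the dictionary cannot explain
            else:
                continue             # multi-character non-words are not candidates
            if best[i] + cost < best[j]:
                best[j] = best[i] + cost
                back[j] = i
    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    j = n
    while j > 0:
        i = back[j]
        words.append(text[i:j])
        j = i
    return words[::-1]


vocab = {"研究", "研究生", "生命", "起源"}  # toy dictionary (hypothetical)
result = dp_segment("研究生命起源", vocab)
print(result)  # ['研究', '生命', '起源']
```

Because every split point is considered, the search is not trapped by an early greedy choice; replacing the fixed costs with negative log word frequencies would turn the same table into a unigram-model segmenter.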
Description: Implements Chinese word segmentation. The vocabulary is stored in WORD.TXT, and the document to be segmented is article.txt. Platform: |
Size: 209920 |
Author:冯翔 |
Hits:
Description: A Chinese word segmentation tool written against the Lucene interface. It performs longest matching in both the forward and reverse directions, then selects a result based on word count. Written by someone else; I have used it and found it good, simple, and quick to pick up. Platform: |
Size: 868352 |
Author:xielang |
Hits:
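The forward-and-reverse strategy described in the entry above can be sketched in Python as follows. This is not the listed tool's code and does not use the Lucene API; the function names, the `max_len` limit, and the tie-breaking rule (prefer the reverse result when both directions yield the same word count) are assumptions for illustration.

```python
def max_match(text, vocab, max_len=4, reverse=False):
    """Longest-match segmentation, scanning left-to-right or right-to-left."""
    words = []
    if not reverse:
        i = 0
        while i < len(text):
            for size in range(min(max_len, len(text) - i), 0, -1):
                piece = text[i:i + size]
                if size == 1 or piece in vocab:
                    words.append(piece)
                    i += size
                    break
    else:
        j = len(text)
        while j > 0:
            for size in range(min(max_len, j), 0, -1):
                piece = text[j - size:j]
                if size == 1 or piece in vocab:
                    words.append(piece)
                    j -= size
                    break
        words.reverse()  # words were collected from the end of the text
    return words


def bidirectional_match(text, vocab, max_len=4):
    """Run both scans and keep the segmentation with fewer words."""
    fwd = max_match(text, vocab, max_len)
    bwd = max_match(text, vocab, max_len, reverse=True)
    # Assumed tie-break: on equal counts, keep the reverse result.
    return fwd if len(fwd) < len(bwd) else bwd


vocab = {"研究", "研究生", "生命", "起源"}  # toy dictionary (hypothetical)
result = bidirectional_match("研究生命起源", vocab)
print(result)  # ['研究', '生命', '起源']
```

Here both scans produce three words, so the assumed tie-break decides the outcome; a production tool would need an explicit rule for this case, such as preferring the result with fewer single-character words.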
Description: The Chinese lexical analysis system ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System) provides Chinese word segmentation, part-of-speech tagging, and unknown-word recognition. Segmentation accuracy reaches 97.58% (as evaluated by the 973 expert group), recall for unknown-word recognition is above 90% in every category, recall for Chinese personal names is close to 98%, and processing speed is 31.5 KB/s. ICTCLAS can also output multiple high-probability results on request, offers several output formats, and supports both the Peking University part-of-speech tag set and the tag set defined by the 973 expert group. The system has been well received by experts, and several papers on it have been published in China and abroad. Platform: |
Size: 4434944 |
Author:lwl |
Hits:
Description: imdict-chinese-analyzer is the intelligent Chinese word segmentation module of the imdict intelligent dictionary. Its algorithm is based on the Hidden Markov Model (HMM). It is a Java reimplementation of the ictclas Chinese word segmentation program from the Institute of Computing Technology, Chinese Academy of Sciences, and can directly provide Simplified Chinese word segmentation support for the lucene search engine. Platform: |
Size: 3256320 |
Author:王同 |
Hits:
Description: Chinese word segmentation source code that I wrote myself in VC++, with complete documentation and a standard segmentation database. Platform: |
Size: 8994816 |
Author:tanyi |
Hits:
Description: Implements Chinese word segmentation and part-of-speech tagging. Segmentation uses "three-word forward longest matching". POS tagging uses an HMM, implemented with the Viterbi algorithm. "Three-word forward longest matching" keeps the speed of the plain forward longest matching algorithm while improving segmentation accuracy. Platform: |
Size: 4034560 |
Author:paul |
Hits:
Description: Chinese-language materials on Chinese word segmentation research, mainly papers by several leading researchers at the Chinese Academy of Sciences. Very helpful for beginners. Platform: |
Size: 1764352 |
Author:王子豪 |
Hits:
Description: A very practical Chinese word segmentation program, implemented in PHP. Platform: |
Size: 677888 |
Author:吴华 |
Hits:
Description: Chinese word segmentation, plus a Chinese word segmentation system with a simple interface. It uses bidirectional matching and lets you choose among different algorithms for segmentation. Platform: |
Size: 1393664 |
Author:wyp |
Hits:
Description: This program implements two Chinese word segmentation systems, forward and reverse, developed with MFC. Platform: |
Size: 2258944 |
Author:Sam Stevent |
Hits:
Description: A very good Chinese word segmentation algorithm. For details, unzip the archive and read the comments. The dictionary file must also be placed in the directory. Platform: |
Size: 1876992 |
Author:zyt |
Hits: