Description: Chinese sentence segmentation using the maximum matching method. Maximum matching is the most commonly used word segmentation algorithm; it is simple and practical, and its accuracy can exceed 80%. Platform: |
Size: 73728 |
Author:廖剑 |
Hits:
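For readers unfamiliar with the method named in this entry, here is a minimal sketch of forward maximum matching. It is not the code in the archive: the function name `forward_max_match`, the `max_word_len` parameter, and the toy vocabulary are all assumptions made for illustration.

```python
def forward_max_match(sentence, vocabulary, max_word_len=4):
    """Greedy left-to-right segmentation (illustrative sketch, not the archive's code)."""
    words = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first and shrink it until a dictionary word is found.
        for size in range(min(max_word_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            # A single character is always accepted so the scan cannot get stuck.
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy vocabulary, chosen to show a typical greedy mistake.
toy_vocabulary = {"研究", "研究生", "生命", "起源"}
print(forward_max_match("研究生命起源", toy_vocabulary))
```

Because the scan always takes the longest word at the current position, it picks "研究生" here and leaves "命" stranded, which is the kind of error that keeps plain maximum matching well short of perfect accuracy.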
Description: Statistics-based word segmentation for Chinese. It can be pointed directly at a directory; the included test was run on Lianhe Zaobao, and the segmentation statistics can cover any directory (as long as the system can handle the load). Written as homework for a classmate :) using ASP.NET + XML. Platform: |
Size: 43008 |
Author: |
Hits:
Description: A Chinese word segmentation program based on dynamic programming, written in VC and easy to extend. Platform: |
Size: 59392 |
Author:何伟 |
Hits:
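The entry does not say which dynamic-programming formulation the program uses. As one possible illustration, the sketch below picks the segmentation with the fewest words; the objective, the function name `dp_segment`, and the toy vocabulary are assumptions, not details taken from the archive.

```python
def dp_segment(sentence, vocabulary, max_word_len=4):
    """Fewest-words segmentation by dynamic programming (assumed objective)."""
    n = len(sentence)
    # best[i] = minimum number of words needed to cover sentence[:i]
    best = [0] + [float("inf")] * n
    # back[i] = start index of the last word in the best cover of sentence[:i]
    back = [0] * (n + 1)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            piece = sentence[start:end]
            # Single characters are always allowed, so every prefix is reachable.
            if end - start == 1 or piece in vocabulary:
                if best[start] + 1 < best[end]:
                    best[end] = best[start] + 1
                    back[end] = start
    # Walk the back-pointers from the end of the sentence to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(sentence[start:end])
        end = start
    words.reverse()
    return words


# Hypothetical toy vocabulary for illustration only.
print(dp_segment("中文分词程序", {"中文", "分词", "中文分词", "程序"}))
```

Unlike a greedy scan, this considers every split point, so swapping the word count for a probability-based cost would only require changing how `best` is updated.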
Description: A word segmentation algorithm.
It is also the prototype for my graduation project.
The algorithm segments at 20,000 to 80,000 per second without using threads. With threads, the speed can exceed several hundred thousand per second. Platform: |
Size: 59392 |
Author:王平 |
Hits:
Description: Implements Chinese word segmentation. The vocabulary is stored in WORD.TXT, and the document to be segmented is article.txt. Platform: |
Size: 209920 |
Author:冯翔 |
Hits:
Description: Uses a browser to split a Word document by page number, saving each page as a separate file. Platform: |
Size: 2048 |
Author:kangyan |
Hits:
Description: Chinese word segmentation source code that I wrote myself in VC++, with complete documentation and a standard segmentation database. Platform: |
Size: 8994816 |
Author:tanyi |
Hits:
Description: Implements Chinese word segmentation and part-of-speech tagging. Segmentation uses "three-word forward longest matching". Part-of-speech tagging uses an HMM, decoded with the Viterbi algorithm. "Three-word forward longest matching" keeps the speed of the forward longest matching algorithm while improving segmentation accuracy. Platform: |
Size: 4034560 |
Author:paul |
Hits:
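To show what HMM tagging with Viterbi decoding involves, here is a small sketch. The tag set, every probability in the toy model, and the `floor` smoothing value are invented for illustration; the archive's actual model and parameters are not described in the listing.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-8):
    """Return the most likely tag sequence under a first-order HMM (sketch)."""
    if not words:
        return []

    def lg(p):
        # Work in log space; `floor` (an assumed value) stands in for smoothing.
        return math.log(max(p, floor))

    # score[t][tag] = best log-probability of a tag sequence ending in `tag` at position t
    score = [{tag: lg(start_p.get(tag, 0)) + lg(emit_p[tag].get(words[0], 0)) for tag in tags}]
    back = [{}]
    for t in range(1, len(words)):
        score.append({})
        back.append({})
        for tag in tags:
            # Pick the previous tag that maximizes the score of reaching `tag`.
            prev = max(tags, key=lambda p: score[t - 1][p] + lg(trans_p[p].get(tag, 0)))
            score[t][tag] = (score[t - 1][prev] + lg(trans_p[prev].get(tag, 0))
                             + lg(emit_p[tag].get(words[t], 0)))
            back[t][tag] = prev
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: score[-1][tag])]
    for t in range(len(words) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical two-tag model: "n" (noun-like) and "v" (verb).
tags = ["n", "v"]
start_p = {"n": 0.8, "v": 0.2}
trans_p = {"n": {"n": 0.3, "v": 0.7}, "v": {"n": 0.8, "v": 0.2}}
emit_p = {"n": {"我": 0.4, "北京": 0.5, "爱": 0.1},
          "v": {"爱": 0.9, "我": 0.05, "北京": 0.05}}
print(viterbi(["我", "爱", "北京"], tags, start_p, trans_p, emit_p))
```

In a real tagger the three probability tables would be estimated from an annotated corpus rather than written by hand.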
Description: Chinese-language material on Chinese word segmentation research, mainly papers by several leading researchers at the Chinese Academy of Sciences. Very helpful for beginners. Platform: |
Size: 1764352 |
Author:王子豪 |
Hits:
Description: A very practical Chinese word segmentation tool, implemented in PHP. Platform: |
Size: 677888 |
Author:吴华 |
Hits:
Description: Implement an automatic Chinese word segmentation program in any programming language.
Optional: recognition of person names, place names, and organization names.
Download the 1999 People's Daily segmented corpus annotated by the Institute of Computational Linguistics at Peking University, and build a word list from it.
Implement the forward and reverse maximum matching segmentation algorithms. Platform: |
Size: 425984 |
Author:黄艳玲 |
Hits:
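The assignment in this entry has two mechanical parts: building a word list from an annotated corpus and running maximum matching in the reverse direction. The sketch below illustrates both. It assumes each corpus line is a run of space-separated `word/tag` tokens, which is a simplification; the function names and sample lines are hypothetical and not taken from the archive.

```python
def build_vocabulary(lines):
    """Collect words from lines of space-separated word/tag tokens (assumed format)."""
    vocabulary = set()
    for line in lines:
        for token in line.split():
            # Drop the tag after the last slash; tokens without a slash are kept whole.
            word = token.rsplit("/", 1)[0]
            if word:
                vocabulary.add(word)
    return vocabulary


def backward_max_match(sentence, vocabulary, max_word_len=4):
    """Greedy right-to-left segmentation (illustrative sketch)."""
    words = []
    end = len(sentence)
    while end > 0:
        # Try the longest candidate ending at `end`, shrinking from the left.
        for size in range(min(max_word_len, end), 0, -1):
            candidate = sentence[end - size:end]
            # A single character is always accepted so the scan cannot get stuck.
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                end -= size
                break
    # Words were collected from the end of the sentence, so restore reading order.
    words.reverse()
    return words


# Hypothetical sample lines in the assumed word/tag format.
sample_lines = ["研究/v 生命/n 起源/n", "研究生/n"]
vocabulary = build_vocabulary(sample_lines)
print(backward_max_match("研究生命起源", vocabulary))
```

Scanning from the right avoids committing to "研究生" at the start of this sentence, which is why the forward and reverse results are often compared to detect ambiguous spans.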
Description: A good Chinese word segmentation algorithm. For details, unzip the archive and read the comments. The dictionary file must also be placed in the directory. Platform: |
Size: 1876992 |
Author:zyt |
Hits:
Description: Uses a scanning method to split apart characters that are stuck together, then recognizes them. Platform: |
Size: 81920 |
Author:宰柯楠 |
Hits: