Search - text cluster
Description: Wawa's (蛙蛙) Chinese text clustering, based mainly on the k-means algorithm and written in C#.
Platform: |
Size: 17173 |
Author: 陈石 |
Hits:
Description: Wawa's (蛙蛙) Chinese text clustering, based mainly on the k-means algorithm and written in C#.
Platform: |
Size: 16384 |
Author: 陈石 |
Hits:
Description: Includes a decomposition clustering algorithm and the k-means clustering algorithm, together with the data text files they use. Development environment: Visual Studio .NET 2003.
Platform: |
Size: 1020928 |
Author: 杨洋 |
Hits:
Description: Builds the feature vector space in two ways, TF-IDF and feature gain, and represents text files as feature vectors in preparation for subsequent clustering. Written in Java.
Platform: |
Size: 712704 |
Author: qiusiheng |
Hits:
Description: An open-source Java toolkit for natural language processing. LingPipe already offers a rich set of features, with APIs for topic classification, named entity recognition, part-of-speech tagging, sentence detection, query spell checking, interesting phrase detection, clustering, character language modeling, MEDLINE download, parsing and indexing, database text mining, Chinese word segmentation, sentiment analysis, and language identification.
Platform: |
Size: 4669440 |
Author: 张国栋 |
Hits:
Description: A distributed full-text search toolkit with cluster support. Developed mainly in Java and fairly easy to use.
Platform: |
Size: 1104896 |
Author: jiahailu |
Hits:
Description: The latest AP (affinity propagation) clustering algorithm with a demo program. The algorithm follows the affinity propagation paper published in Science.
Platform: |
Size: 5120 |
Author: lilan |
Hits:
Description: Text clustering implemented in Java, including the data preprocessing done before clustering: word segmentation, dimensionality reduction, building the vector space model, and so on.
Platform: |
Size: 17408 |
Author: 优优 |
Hits:
Description: 1. Problem description: to build a communication network among n cities, only n-1 lines need to be laid. Building this network at the lowest economic cost is the minimum spanning tree problem for a network. 2. Basic requirements: (1) use Kruskal's algorithm to find the minimum spanning tree of a graph; (2) implement the abstract data type MFSet defined in Section 6.5 of the textbook and use it to represent the connected components while the spanning tree is being built; (3) output the edges of the spanning tree and their weights as text. 3. Requirements analysis: (1) construct the graph structure; (2) find the minimum spanning tree with Kruskal's algorithm; (3) output the spanning tree.
Platform: |
Size: 684032 |
Author: 赵婧 |
Hits:
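The requirements in the entry above (Kruskal's algorithm, an MFSet for connected components, text output of edges and weights) can be illustrated with a minimal sketch. This is not the uploaded code: the listing does not state the submission's language, so Python is an assumption, and the names `MFSet`, `merge`, `kruskal`, and the sample graph are hypothetical choices made for illustration only.

```python
class MFSet:
    """Merge-find set: tracks which vertices are in the same connected component."""

    def __init__(self, n):
        self.parent = list(range(n))  # each vertex starts as its own root
        self.rank = [0] * n

    def find(self, x):
        # Walk up to the root, shortening the path as we go (path halving).
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def merge(self, a, b):
        """Join the components of a and b; return False if already joined."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True


def kruskal(n, edges):
    """edges: iterable of (weight, u, v) with vertices 0..n-1.
    Returns the chosen edges as (u, v, weight) tuples."""
    components = MFSet(n)
    tree = []
    for weight, u, v in sorted(edges):      # cheapest edges first
        if components.merge(u, v):          # skip edges that would form a cycle
            tree.append((u, v, weight))
            if len(tree) == n - 1:          # a spanning tree has n-1 edges
                break
    return tree


if __name__ == "__main__":
    # Hypothetical 6-city network: (cost, city_a, city_b)
    links = [(6, 0, 1), (1, 0, 2), (5, 0, 3), (5, 1, 2), (3, 1, 4),
             (5, 2, 3), (6, 2, 4), (4, 2, 5), (2, 3, 5), (6, 4, 5)]
    for u, v, w in kruskal(6, links):
        print(f"{u} - {v}: {w}")
```

Sorting the edges once and letting the MFSet reject any edge whose endpoints already share a component is what keeps the result cycle-free. If the graph is disconnected, this sketch returns fewer than n-1 edges rather than raising an error.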
Description: A relatively complete data mining system (a few features are missing). It includes common algorithms grouped by category, Association (Apriori, C4.5, GrowTree), Classification (ID3), and Cluster, along with the author's own optimized versions. It also provides direct connections from the interface to several data sources (MS Access, Excel, SQL Server, TEXT). The interface looks like a professional system, using a VC-style floating multi-frame layout, and can serve as a reference for UI programming. This project is worth the same points as other small tools, which feels like a real loss :)
Platform: |
Size: 572416 |
Author: 马志强 |
Hits:
Description: Two Chinese-language articles that describe clustering algorithms precisely, aimed mainly at beginners. The material does not go very deep; hopefully it is helpful.
Platform: |
Size: 1268736 |
Author: 刘扬 |
Hits:
Description: Source code for a text clustering algorithm, including an implementation of the tf.idf computation. Written in Java.
Platform: |
Size: 9216 |
Author: 谭磊 |
Hits:
Description: The k-nearest neighbor algorithm, a classic classification algorithm. Definitely helpful.
Platform: |
Size: 17408 |
Author: freesunshine |
Hits:
Description: With machine learning and computational statistics as its background, this book explains how to mine and analyze data and resources on the Web, how to analyze information such as user experience, marketing, and personal taste and draw useful conclusions, and how to use sophisticated algorithms to obtain, collect, and analyze user data and feedback from websites in order to create new user value and business value. The book is rich in content, covering collaborative filtering (for related-product recommendations), cluster analysis (finding similar subsets in large data sets), core search engine technology (crawlers, indexing, query engines, the PageRank algorithm, and so on), optimization algorithms for searching massive amounts of information and drawing conclusions through statistical analysis, Bayesian filtering (spam filtering, text filtering), decision trees for prediction and decision modeling, information matching for social networks, and applications of machine learning and artificial intelligence.
Platform: |
Size: 2633728 |
Author: 陈磊 |
Hits:
Description: Data mining code written in Java, including Java source for text classification and authorship identification. I took part in writing it myself.
Platform: |
Size: 57344 |
Author: xiao |
Hits:
Description: Design and implementation of a Web document clustering system. Keywords: data mining; cluster analysis; text mining; preprocessing; cluster combination; visualization; Euclidean distance.
Platform: |
Size: 2422784 |
Author: 王三 |
Hits:
Description: Explains the vector space model (VSM) mechanism for text clustering, Kmeans, and related topics in considerable detail. A good book for studying text clustering.
Platform: |
Size: 2488320 |
Author: zhan |
Hits:
Description: Detailed configuration of a Tomcat cluster, with detailed configuration methods and written explanations.
Platform: |
Size: 19456 |
Author: 坚持到底 |
Hits:
Description: A Python implementation of K-Means text clustering (texts included), applied to personal name disambiguation.
Platform: |
Size: 766976 |
Author: Chris Ma |
Hits:
Description: A text classification test on the Weka platform. The source code is in Java.
Platform: |
Size: 277504 |
Author: ziyan |
Hits: