Description: Cluster analysis is widely used. Because prior knowledge is usually unavailable when clustering images, an unsupervised clustering algorithm is needed. To meet the needs of image retrieval, a new unsupervised clustering method is proposed that uses outlier information to determine automatically when the clustering algorithm should terminate. The method also addresses the weaknesses of existing clustering algorithms in identifying and using outliers. To verify its feasibility, it was used to improve two classic algorithms, CURE and ROCK. Experiments show that both improved algorithms terminate automatically and achieve better clustering results than before. Platform: |
Size: 1024 |
Author: |
Hits:
Description: Many of the pattern-finding algorithms frequently used in data mining, such as decision
trees, classification rules and clustering techniques, were developed in the machine learning
research community. Frequent pattern and association rule mining is one of the few exceptions
to this tradition. The introduction of this technique boosted data mining research, and its
impact has been tremendous. The algorithm is quite simple and easy to implement. Experimenting
with an Apriori-like algorithm is the first thing that data miners try to do. Platform: |
Size: 132096 |
Author:鱼彬彬 |
Hits:
Description: State-of-the-art language modeling methods:
An Empirical Study of Smoothing Techniques for Language Modeling.pdf
BLEU, a Method for Automatic Evaluation of Machine Translation.pdf
Class-based n-gram models of natural language.pdf
Distributed Language Modeling for N-best List Re-ranking.pdf
Distributed Word Clustering for Large Scale Class-Based Language Modeling in.pdf
Platform: |
Size: 2016256 |
Author:wen6860 |
Hits:
Description: Six AI algorithms: path planning and movement, finite state machines, scripting, flocking, genetic algorithms, and neural networks. Each algorithm comes with example programs. Platform: |
Size: 2953216 |
Author:北平 |
Hits:
Description: This article describes implementation techniques for the VC development platform. Studying it can help you master VC programming techniques. Platform: |
Size: 81920 |
Author:jiangjiang |
Hits:
Description: Text clustering is an automatic clustering technique based on similarity algorithms. It automatically groups large numbers of unlabeled documents, places documents with similar content in the same cluster, and generates characteristic topic keywords for each cluster. It is suited to applications such as automatic generation of hot public-opinion topics, tracking of major news events, and visual analysis of intelligence.
Lingjoin (灵玖, www.lingjoin.com) builds on its core feature discovery technology to overcome the high space consumption and long processing time of traditional clustering methods. Clustering is fast and accurate with low memory consumption, which makes it particularly suitable for clustering very large corpora and short-text corpora.
The main features of the Lingjoin document clustering component are:
1. Fast: it can handle massive volumes of web text data, processing at least 500,000 documents per hour on average.
2. Accurate clustering: the top N cluster centers often reflect current hot topics, which suits public-opinion hot-topic computation. Compared with the technology of Autonomy, a company known internationally for clustering, Lingjoin describes its own metrics as far ahead, perhaps because Lingjoin understands Chinese better.
3. Accurate ranking: clusters are ranked by influence weight, and the documents within each cluster are ranked by importance.
4. Customizable: the number of clusters and the cluster centers can be customized.
5. Open interface: the Lingjoin document clustering component is part of LJParser and offers a flexible development interface, so it can be integrated easily into a user's business systems. It supports a variety of operating systems and calling languages.
Lingjoin document clustering can be used in text mining, knowledge management, search result clustering, public-opinion monitoring, and other applications. Platform: |
Size: 1100800 |
Author:lingjoin |
Hits:
Description: Content-based medical image retrieval is attracting more and more attention
worldwide, and a feasible and efficient retrieval algorithm for clinical endoscopic images is urgently
required. Methods: Based on a study of single-feature image retrieval techniques, including color
clustering, color texture and shape, a new retrieval method with multi-feature fusion and relevance
feedback is proposed to retrieve the desired endoscopic images. Results: A prototype system is set
up to evaluate the proposed method's performance, and evaluation parameters such as
retrieval precision and recall and the statistical average position of the top 5 most similar images on various features
are given. Conclusions: The algorithm with multi-feature fusion and relevance feedback
retrieves more accurately and more quickly than single-feature image retrieval
techniques, owing to its flexible feature combination and interactive relevance feedback. Platform: |
Size: 359424 |
Author:gokul/goks |
Hits:
Description: This book was written for anyone who wants to implement data clustering algorithms and for those who want to implement new data clustering algorithms in a better way. Using object-oriented design and programming techniques, I have exploited the commonalities of all data clustering algorithms to create a flexible set of reusable classes that simplifies the implementation of any data clustering algorithm. Readers can follow me through the development of the base data clustering classes and several popular data clustering algorithms. Platform: |
Size: 3153920 |
Author:numa |
Hits:
Description: Exploring the Use of Data Fitting and Clustering Techniques for Robust Speech Recognition Platform: |
Size: 749568 |
Author:ll |
Hits:
Description: This paper presents an analytical study of different clustering techniques used for image segmentation. Platform: |
Size: 355328 |
Author:Bora |
Hits:
Description: A survey of data clustering by foreign authors, covering nearest-neighbor methods, fuzzy clustering, neural networks, data mining applications, and image processing applications. Clustering is the unsupervised classification of patterns (observations, data items,
or feature vectors) into groups (clusters). The clustering problem has been
addressed in many contexts and by researchers in many disciplines; this reflects its
broad appeal and usefulness as one of the steps in exploratory data analysis.
However, clustering is a difficult problem combinatorially, and differences in
assumptions and contexts in different communities have made the transfer of useful
generic concepts and methodologies slow to occur. This paper presents an overview
of pattern clustering methods from a statistical pattern recognition perspective,
with a goal of providing useful advice and references to fundamental concepts
accessible to the broad community of clustering practitioners. We present a
taxonomy of clustering techniques, and identify cross-cutting themes and recent
advances. We also describe some important applications of clustering algorithms
such as image segmentation, o Platform: |
Size: 565248 |
Author:shenaimin |
Hits: