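Description: Hadoop is a framework for running applications on large clusters of inexpensive commodity hardware. It transparently provides applications with a set of stable, reliable interfaces and data motion. Hadoop implements Google's MapReduce algorithm, which divides an application into many small units of work, each of which can be executed or re-executed on any node in the cluster. Hadoop also provides a distributed file system that stores data on the compute nodes and delivers high throughput for reads and writes. Because it combines map/reduce with the distributed file system, the framework is highly fault tolerant and handles failed nodes automatically. Hadoop has been tested on clusters of 600 nodes. Apache Hadoop Core is a software platform that lets one easily write and run applications that process vast amounts of data.
Here's what makes Hadoop especially useful: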
* Scalable: Hadoop can reliably store and process petabytes.
* Economical: It distributes the data and processing across clusters of commonly available computers. These clusters can number into the thousands of nodes.
* Efficient: By distributing the data, Hadoop can process it in parallel on the nodes where the data is located. This makes it extremely rapid.
* Reliable: Hadoop automatically maintains multiple copies of data and automatically redeploys computing tasks based on failures.
Hadoop implements MapReduce, using the Hadoop Distributed File System (HDFS) (see figure below.) MapReduce divides applications into many small blocks of work. HDFS creates multiple replicas of data blocks for reliability, placing them on compute nodes around the cluster. MapReduce can then process the data w Platform: |
Size: 3598336 |
Author:宾利金 |
Hits:
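The entry above says that MapReduce divides an application into many small units of work. A minimal single-process sketch of that idea follows, using word counting as the job. It does not use Hadoop itself; the function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) and the sample input are assumptions made for illustration only.

```python
from collections import defaultdict


def map_words(split):
    """Map phase: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(word, counts):
    """Reduce phase: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(splits):
    """Run map, shuffle, and reduce over a list of text splits."""
    emitted = []
    # Each split is an independent unit of work. On a real cluster it could
    # run, or be re-run after a failure, on any node.
    for split in splits:
        emitted.extend(map_words(split))
    return dict(reduce_counts(word, counts)
                for word, counts in shuffle(emitted).items())


if __name__ == "__main__":
    sample_splits = ["hadoop stores data", "hadoop processes data in parallel"]
    print(word_count(sample_splits))
```

Because each split is mapped independently, the loop over splits is the part a cluster would distribute across nodes; only the shuffle needs to see the output of every map task.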
Description: Hadoop is an open source distributed parallel programming framework that implements the MapReduce computation model. With Hadoop, programmers can easily write distributed parallel programs and run them on a computer cluster to process massive amounts of data. Platform: |
Size: 42814464 |
Author:xq |
Hits:
Description: Hadoop got its start in Nutch. A few of us were attempting to build an open source web search engine and having trouble managing computations running on even a handful of computers. Platform: |
Size: 3488768 |
Author:方桭枯 |
Hits:
Description: A Hadoop-based inverted index generation tool. It takes a set of text files as input and outputs each word together with the documents and positions in which it appears. Platform: |
Size: 14336 |
Author:李明 |
Hits:
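The index-generation tool described above maps each word to the documents and positions where it occurs. The sketch below shows one way that output could be produced, in plain Python rather than on Hadoop. It is not the tool's actual code: the function names, the use of word offsets as positions, and the sample file names are all assumptions.

```python
from collections import defaultdict


def map_document(doc_id, text):
    """Map phase: emit (word, (doc_id, position)) for each word.

    Position is the zero-based word offset within the document, which is
    an assumption; the original tool may use byte or line offsets instead.
    """
    return [(word.lower(), (doc_id, position))
            for position, word in enumerate(text.split())]


def build_inverted_index(documents):
    """Group the emitted postings by word, then by document."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for word, (doc, position) in map_document(doc_id, text):
            index[word][doc].append(position)
    # Convert nested defaultdicts to plain dicts for a clean result.
    return {word: dict(postings) for word, postings in index.items()}


if __name__ == "__main__":
    sample_docs = {  # hypothetical input files
        "a.txt": "hadoop runs mapreduce",
        "b.txt": "mapreduce on hadoop",
    }
    for word, postings in sorted(build_inverted_index(sample_docs).items()):
        print(word, postings)
```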
Description: Hadoop: The Definitive Guide helps you harness the power of your data. Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you: use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce; become familiar with Hadoop’s data and I/O building blocks for compression, data integrity, serialization, and persistence; discover common pitfalls and advanced features for writing real-world MapReduce programs; design, build, and administer a dedicated Hadoop cluster, or run Hadoop in th Platform: |
Size: 3571712 |
Author:欧曜玮 |
Hits:
Description: Uses the Hadoop MapReduce model to find the maximum value across a cluster. Platform: |
Size: 14336 |
Author:付新 |
Hits:
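Finding a maximum fits the map/reduce pattern well, because each node can reduce its own data to a single local maximum before anything is sent over the network. The following is a rough single-process sketch under that assumption. The names `map_partition_max`, `reduce_max`, and `find_max` are hypothetical and are not taken from the listed program.

```python
def map_partition_max(partition):
    """Map phase: emit this partition's local maximum under one shared key.

    An empty partition emits nothing.
    """
    return [("max", max(partition))] if partition else []


def reduce_max(values):
    """Reduce phase: the global maximum is the maximum of the local maxima."""
    return max(values)


def find_max(partitions):
    """Combine the local maxima of all partitions into one global maximum."""
    local_maxima = []
    for partition in partitions:
        local_maxima.extend(value for _key, value in map_partition_max(partition))
    if not local_maxima:
        raise ValueError("no values in any partition")
    return reduce_max(local_maxima)


if __name__ == "__main__":
    sample_partitions = [[3, 9, 2], [], [7, 41, 5]]  # one list per node
    print(find_max(sample_partitions))
```

Emitting one value per partition rather than one per record keeps the data passed to the reduce step proportional to the number of partitions, not the number of records.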
Description: Hadoop in Action. This book explains Hadoop in detail, with thorough coverage of MapReduce. Platform: |
Size: 3997696 |
Author:陈震 |
Hits:
Description: The author's own analysis of the Hadoop source code, explaining how Hadoop MapReduce operates in cloud computing. Platform: |
Size: 989184 |
Author:毛福林 |
Hits:
Description: This is a demo implementation of the C4.5 algorithm using the Hadoop MapReduce framework, written in Java. Platform: |
Size: 5120 |
Author:obra |
Hits:
Description: Starting from the origins of Hadoop, this book moves from basic to advanced material, combining theory and practice to give a comprehensive introduction to Hadoop, an ideal tool for high-performance processing of massive datasets. The book has 16 chapters and 3 appendices. Topics covered include: an introduction to Hadoop; an introduction to MapReduce; the Hadoop Distributed File System; Hadoop I/O; MapReduce application development; how MapReduce works; MapReduce types and formats; MapReduce features; how to build a Hadoop cluster; how to administer Hadoop; introductions to Pig, HBase, Hive, and ZooKeeper; and the open source tool Sqoop. It closes with a rich set of case studies. Platform: |
Size: 23098368 |
Author:xinyue |
Hits:
Description: Implementation of single-table association and multi-table association (joins) on the Hadoop cloud platform. Platform: |
Size: 44032 |
Author:瑾love |
Hits:
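A single-table association joins a table with itself. The entry above does not say which data it uses, so the sketch below assumes a hypothetical child/parent table and derives grandchild/grandparent pairs from it. The function names and sample rows are illustrative assumptions, and the code runs in one process rather than on Hadoop.

```python
from collections import defaultdict


def map_relation(child, parent):
    """Map phase: emit each row twice, keyed by the person who may link
    two generations."""
    return [
        (parent, ("child", child)),    # under key `parent`, record their child
        (child, ("parent", parent)),   # under key `child`, record their parent
    ]


def self_join(rows):
    """Return sorted (grandchild, grandparent) pairs from (child, parent) rows."""
    grouped = defaultdict(lambda: {"child": [], "parent": []})
    for child, parent in rows:
        for key, (role, name) in map_relation(child, parent):
            grouped[key][role].append(name)

    pairs = []
    # Reduce phase: for each middle person, pair every one of their
    # children with every one of their parents.
    for roles in grouped.values():
        for grandchild in roles["child"]:
            for grandparent in roles["parent"]:
                pairs.append((grandchild, grandparent))
    return sorted(pairs)


if __name__ == "__main__":
    sample_rows = [  # hypothetical (child, parent) rows
        ("Tom", "Lucy"),
        ("Tom", "Jack"),
        ("Lucy", "Mary"),
        ("Lucy", "Ben"),
    ]
    print(self_join(sample_rows))
```

A multi-table association can follow the same shape: tag each emitted value with the table it came from, key it by the join column, and pair the two tagged groups in the reduce step.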