Apache Hadoop

Software Details:
Version: 2.7.2
Upload Date: 10 Feb 16
Distribution Type: Freeware
Downloads: 562

Much of Apache Hadoop's early development took place at Yahoo, and the current project combines the earlier Apache Hadoop Core and Apache Hadoop Common repositories.

The Hadoop project has gained wide recognition for implementing a distributed computing system that spreads the storage and processing of very large data sets across many servers.

The project itself is made of four parts. The first is Hadoop Common, the so-called core that allows all other modules to work. The second is its own filesystem, HDFS (Hadoop Distributed File System). The third is the Hadoop YARN scheduling framework, and the fourth is the Hadoop MapReduce system for parallel processing.
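
To make the MapReduce part more concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps, using word counting as the task. It does not use Hadoop at all; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this illustration, and a real job would run these steps in parallel across the cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, the step that sits between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in order on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

Because each call to map_phase looks at one line only and each call to reduce_phase looks at one key only, both steps can be distributed over many machines, which is the property the MapReduce module relies on.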

Around this system, the Apache Software Foundation has grown an ecosystem of related projects, including:

- Apache Ambari

- Apache Avro

- Apache Cassandra

- Apache HBase

- Apache Hive

- Apache Mahout

- Apache Pig

- Apache Spark

- Apache Tez

- Apache ZooKeeper

These projects build on or integrate with Hadoop's data processing engine or its distributed computing features, typically through one or more of its modules.

What is new in this release and in versions 2.7.1, 2.7.0, and 2.6.0:

  • Support for Archival Storage
  • Transparent data at rest encryption (beta)
  • Operating secure DataNode without requiring root access
  • Hot swap drive: support add/remove data node volumes without restarting data node (beta)
  • AES support for faster wire encryption
  • Support for long running services in YARN
  • Support node labels during scheduling
  • Support for time-based resource reservations in Capacity Scheduler (beta)
  • Global, shared cache for application artifacts (beta)
  • Support running of applications natively in Docker containers (alpha)

What is new in version 2.5.0:

  • Authentication improvements when using an HTTP proxy server.
  • A new Hadoop Metrics sink that allows writing directly to Graphite.
  • Specification for Hadoop Compatible Filesystem effort.
  • Support for POSIX-style filesystem extended attributes.
  • OfflineImageViewer to browse an fsimage via the WebHDFS API.
  • Supportability improvements and bug fixes to the NFS gateway.
  • Modernized web UIs (HTML5 and JavaScript) for HDFS daemons.
  • YARN's REST APIs support submitting and killing applications.
  • Kerberos integration for YARN's timeline store.
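
As an illustration of what a Graphite sink ultimately has to write, the sketch below formats metrics in Graphite's plaintext protocol, which is one line per metric in the form "path value timestamp". This is not Hadoop's sink code; the function names and the metric path used in the example are hypothetical.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one metric as a Graphite plaintext line: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host, port=2003):
    """Send pre-formatted lines to a Carbon listener.

    Port 2003 is Graphite's usual plaintext port; adjust it for your setup.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


if __name__ == "__main__":
    # Hypothetical metric path; real names depend on the sink configuration.
    print(graphite_line("hadoop.namenode.FilesTotal", 42, 1455062400), end="")
```

Only graphite_line is exercised in the example; send_metrics needs a reachable Graphite server and is included to show where the formatted lines would go.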

What is new in version 2.4.0:

  • Support for Access Control Lists in HDFS
  • Native support for Rolling Upgrades in HDFS
  • Usage of protocol-buffers for HDFS FSImage for smooth operational upgrades
  • Complete HTTPS support in HDFS
  • Support for Automatic Failover of the YARN ResourceManager
  • Enhanced support for new applications on YARN with Application History Server and Application Timeline Server
  • Support for strong SLAs in YARN CapacityScheduler via Preemption
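
HDFS ACL entries are usually written in the same compact text form as POSIX ACLs, for example user:alice:rw- or default:group::r-x. The sketch below parses that form into its parts. It is a standalone illustration, not code from Hadoop; AclEntry and parse_acl_entry are names invented here.

```python
from typing import NamedTuple


class AclEntry(NamedTuple):
    scope: str  # "access" or "default"
    kind: str   # "user", "group", "mask" or "other"
    name: str   # empty string means the owning user or group
    perms: str  # three characters such as "rw-"


def parse_acl_entry(text):
    """Parse one ACL entry such as 'user:alice:rw-' or 'default:group::r-x'."""
    parts = text.split(":")
    scope = "access"
    if parts[0] == "default":
        scope = "default"
        parts = parts[1:]
    if len(parts) != 3:
        raise ValueError(f"expected kind:name:perms, got {text!r}")
    kind, name, perms = parts
    if kind not in {"user", "group", "mask", "other"}:
        raise ValueError(f"unknown entry type {kind!r}")
    # Each position accepts its own letter or '-' for "not granted".
    valid = len(perms) == 3 and all(
        char in allowed for char, allowed in zip(perms, ("r-", "w-", "x-"))
    )
    if not valid:
        raise ValueError(f"bad permission string {perms!r}")
    return AclEntry(scope, kind, name, perms)


if __name__ == "__main__":
    print(parse_acl_entry("user:alice:rw-"))
    print(parse_acl_entry("default:group::r-x"))
```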

What is new in version 2.3.0:

  • Support for Heterogeneous Storage hierarchy in HDFS.
  • In-memory cache for HDFS data with centralized administration and management.
  • Simplified distribution of MapReduce binaries via HDFS in YARN Distributed Cache.

What is new in version 2.2.0:

  • YARN - A general-purpose resource management system for Hadoop that supports MapReduce and other data processing frameworks and services
  • High Availability for HDFS
  • HDFS Federation
  • HDFS Snapshots
  • NFSv3 access to data in HDFS

What is new in version 2.1.0-beta:

  • HDFS Snapshots
  • Support for running Hadoop on Microsoft Windows
  • YARN API stabilization

What is new in version 2.0.3-alpha:

  • QJM for HDFS HA for NameNode
  • Multi-resource scheduling (CPU and memory) for YARN
  • YARN ResourceManager Restart
  • Significant stability at scale for YARN (over 30,000 nodes and 14 million applications so far, at time of release)
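
When a scheduler has to weigh CPU and memory together, one common approach is to compare applications by their dominant share, meaning the largest fraction of any single cluster resource they hold. The sketch below illustrates that idea only; it is not YARN's scheduler code, and the resource names, numbers, and function names are assumptions made for the example.

```python
def dominant_share(usage, capacity):
    """Return the largest fraction of any one cluster resource that an app holds."""
    return max(usage[resource] / capacity[resource] for resource in capacity)


def pick_next(apps, capacity):
    """Choose the app with the smallest dominant share to receive the next container."""
    return min(apps, key=lambda name: dominant_share(apps[name], capacity))


if __name__ == "__main__":
    # Hypothetical cluster totals and per-application usage.
    capacity = {"vcores": 100, "memory_mb": 200000}
    apps = {
        "cpu_heavy": {"vcores": 10, "memory_mb": 4000},
        "mem_heavy": {"vcores": 2, "memory_mb": 30000},
    }
    for name, usage in apps.items():
        print(name, dominant_share(usage, capacity))
    print("next:", pick_next(apps, capacity))
```

In this example the CPU-heavy application holds 10% of the cores and the memory-heavy one holds 15% of the memory, so the CPU-heavy application is served next even though it uses more cores.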

What is new in version 1.0.0:

  • Better security.
  • HBase support (append/hsync/hflush, and security).
  • WebHDFS (with full support for security).
  • Performance-enhanced access to local files for HBase.
  • Other performance enhancements, bug fixes, and features.
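
WebHDFS exposes filesystem operations over HTTP using URLs of the form http://host:port/webhdfs/v1/path?op=OPERATION. The sketch below only builds such a URL and sends nothing. The host name and the port 50070 (a common NameNode HTTP port in Hadoop 2.x) are assumptions for the example, and webhdfs_url is a helper invented here.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, params=None):
    """Build a WebHDFS REST URL for the given HDFS path and operation."""
    query = {"op": op}
    query.update(params or {})
    # quote() keeps '/' intact and escapes characters such as spaces.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(query)}"


if __name__ == "__main__":
    # Hypothetical NameNode address and user name.
    print(webhdfs_url("namenode.example.com", 50070, "/user/alice", "LISTSTATUS"))
    print(
        webhdfs_url(
            "namenode.example.com",
            50070,
            "/user/alice/my file.txt",
            "OPEN",
            {"user.name": "alice"},
        )
    )
```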

What is new in version 0.20.2:

  • RPC Server send buffer retains size of largest response ever sent.
  • C++ libraries do not build on Debian Lenny.
  • Some C++ scripts are not chmodded before ant execution.
  • Streaming: process-provided status messages are overwritten every 10 seconds.
  • IllegalArgumentException when CombineFileInputFormat is used as job InputFormat.
  • Multiple bugs with Hadoop archives.
  • Allow caching of filesystem instances to be disabled on a per-instance basis.
  • Missing synchronization for defaultResources in Configuration.addResource.
  • GzipCodec should not represent BuiltInZlibInflater as decompressorType.
  • NameNode's HttpServer can't instantiate InetSocketAddress: IllegalArgumentException is thrown.
  • HttpServer sleeps with negative values.
  • NameNode runs out of memory due to a memory leak in the IPC Server.
  • IPC client bug may cause an RPC call to hang.
  • Failing tests prevent the rest of test targets from execution.
  • Contrib tests are failing Clover'ed build.
  • Tests do not run on 0.20 branch.
  • TestStreamingStatus is failing on 0.20 branch.
