Apache Crunch

Apache Crunch 0.13.0 updated

A pipeline is a chain of operations that performs a specific job, arranged so that the output of each stage is the input of the next. Apache Crunch provides an easier way of working with Apache Hadoop MapReduce pipelines. Crunch simplifies this...

Apache Slider

Apache Slider 0.80.0 updated

Apache Slider targets Hadoop environments and is based on Hadoop's next-generation MapReduce 2.0 architecture, also known as YARN. Slider can be used to create YARN-compliant applications that interact with the underlying Hadoop database or database...

Apache OpenJPA

Apache OpenJPA 2.4.0 / 1.2.3 updated

Apache OpenJPA comes in two separate branches, both production-ready. The difference between the two is the standard each implements. The 1.x branch follows the JSR-220 Enterprise Java Beans 3.0 specification, while the 2.x branch was modeled...

Apache Accumulo

Apache Accumulo 1.7.0 updated

Apache Accumulo is a mashup of various technologies, from Google's BigTable to Apache's Hadoop, Thrift, and ZooKeeper. Compared to Google's BigTable system, Accumulo features a few improvements of its own. These include table cell-based access restrictions,...

Apache MRUnit lets developers write unit tests that detect problems with MapReduce jobs before they are run on the Hadoop cluster itself. By unit testing Hadoop's MapReduce jobs, developers can avoid wasting resources, a good habit to...

Apache Sqoop

Apache Sqoop 1.4.6 / 1.99.6 updated

Apache Sqoop is a must-have tool for every database administrator, letting them easily move data between Hadoop and more classic database systems like PostgreSQL, MSSQL, MariaDB, or MySQL, a.k.a. relational databases. Sqoop basically...

Apache CouchDB was initially developed at IBM and donated later on to the Apache Software Foundation. Compared to other databases around, CouchDB is still very young, but this has not stopped it from gathering quite a following in its short lifespan. The...