Apache Pig

Software Details:
Version: 0.15.0
Upload Date: 20 Jul 15
Distribution Type: Freeware

Apache Pig began as a subproject of Apache Hadoop, in charge of providing a way to analyze the data that Hadoop processes and stores.

Pig uses its own query language, called "Pig Latin", which is designed to be easy to learn and supports both relational and functional styles.

This means you can use it much like SQL, with data joins and filters, or you can work closer to the MapReduce model, with data mappers and reducers.
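The difference between the two styles can be sketched in plain Python. This is only an illustration of the ideas, not Pig Latin and not Pig's API: the sample data and the function names (relational_style, map_phase, reduce_phase) are invented for this example.

```python
from collections import defaultdict

# Hypothetical sample data, invented for illustration only.
users = [(1, "ana", 34), (2, "bo", 19), (3, "cy", 41)]            # (id, name, age)
visits = [(1, "/home"), (3, "/docs"), (3, "/home"), (2, "/faq")]  # (user_id, url)


def relational_style(users, visits, min_age=21):
    """Filter users by age, then join them to their visits on the user id."""
    adults = {uid: name for uid, name, age in users if age >= min_age}   # filter
    return [(adults[uid], url) for uid, url in visits if uid in adults]  # join


def map_phase(visits):
    """Mapper: emit one (url, 1) pair per visit."""
    for _uid, url in visits:
        yield url, 1


def reduce_phase(pairs):
    """Group the mapper output by key, then sum the values of each group."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: sum(values) for key, values in grouped.items()}


if __name__ == "__main__":
    print(relational_style(users, visits))
    print(reduce_phase(map_phase(visits)))
```

The first function reads like a query (keep some rows, match them against another data set), while the second pair spells out the mapper and reducer steps by hand.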

Apache Pig was originally meant to be used inside Hadoop installations, but newer versions can also run on their own in local mode, inside a single JVM.

What is new in this release:

  • Pluggable execution engines (to allow Pig to run on non-MapReduce engines in the future)
  • Auto-local mode (to let jobs with small input data run in-process)
  • Fetch optimization (to improve the interactivity of Grunt)
  • Fixed counters for local mode
  • Support for a user-level JAR cache
  • Support for blacklisting and whitelisting Pig commands
  • Several performance fixes and debuggability features
  • A few non-backwards-compatible interface changes, introduced in this release so that Pig can work with non-MapReduce engines

What is new in version 0.11.0:

  • This release includes the DateTime data type, the RANK, CUBE and ROLLUP operators, Groovy UDFs, custom reducer estimation, schema-based tuples and HCatalog DDL integration.
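As a rough illustration of how CUBE and ROLLUP style aggregation differ, here is a plain Python sketch. It is not Pig Latin and does not use any Pig API: the sample rows and the helper names (rollup_groupings, cube_groupings, aggregate) are hypothetical, and None is used here to mark a dimension that has been aggregated away.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sales rows, invented for illustration: (region, product, amount)
sales = [("eu", "pen", 3), ("eu", "ink", 5), ("us", "pen", 2)]


def rollup_groupings(n_dims):
    """Hierarchical groupings: keep the first k dimensions, for k = n down to 0."""
    return [tuple(i < k for i in range(n_dims)) for k in range(n_dims, -1, -1)]


def cube_groupings(n_dims):
    """All groupings: every subset of the dimensions is kept once."""
    masks = []
    for size in range(n_dims, -1, -1):
        for kept in combinations(range(n_dims), size):
            masks.append(tuple(i in kept for i in range(n_dims)))
    return masks


def aggregate(rows, masks):
    """Sum the last column for every grouping; None marks a dropped dimension."""
    totals = defaultdict(int)
    for *dims, amount in rows:
        for mask in masks:
            key = tuple(d if keep else None for d, keep in zip(dims, mask))
            totals[key] += amount
    return dict(totals)


if __name__ == "__main__":
    print(aggregate(sales, rollup_groupings(2)))
    print(aggregate(sales, cube_groupings(2)))
```

With two dimensions, the rollup yields totals per (region, product), per region, and overall, while the cube additionally yields totals per product across all regions.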

What is new in version 0.9.1:

  • This release works with Hadoop 0.20.

What is new in version 0.6:

  • Added Zebra as a contrib project. See http://wiki.apache.org/pig/zebra
  • Added UDFContext, which gives UDFs a way to pass information from the front end to the back end and gives UDFs access to JobConf in the back end.
  • Added left outer join support for fragment-replicate joins.
  • Added ability to set job priority from Pig Latin.
  • Enhanced multi-query to work with joins in some cases.
  • Reworked memory manager to significantly reduce GC Overhead and Out of Heap failures.
  • Added Accumulator interface for UDFs.
  • Over 100 bug fixes and improvements.
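The idea behind an accumulator-style UDF, as mentioned in the list above, is that the function receives its input in batches instead of as one fully materialized collection. The sketch below shows that pattern in plain Python. It is an assumption-laden illustration, not Pig's Java interface: the class SumAccumulator, the helper functions and the batch size are all hypothetical.

```python
class SumAccumulator:
    """Keeps a running total so the whole input never has to be held at once."""

    def __init__(self):
        self.total = 0

    def accumulate(self, batch):
        """Fold one batch of values into the running total."""
        self.total += sum(batch)

    def get_value(self):
        """Return the result once all batches have been seen."""
        return self.total

    def cleanup(self):
        """Reset the state so the instance can be reused for the next group."""
        self.total = 0


def batches(values, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def run(values, size=2):
    """Feed the values to the accumulator batch by batch and return the sum."""
    acc = SumAccumulator()
    for batch in batches(values, size):
        acc.accumulate(batch)
    result = acc.get_value()
    acc.cleanup()
    return result


if __name__ == "__main__":
    print(run([1, 2, 3, 4, 5]))
```

Because only one batch and a running total are held at a time, memory use stays bounded even when a group is very large.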

Requirements:

  • Java 1.6.x or higher
  • Apache Hadoop 0.20.x or higher
