Apache Parquet

Software Details:
Version: 2.3.1 (updated)
Upload Date: 9 Feb 16
Distribution Type: Freeware
Downloads: 159

Rating: 3.0/5 (Total Votes: 1)

Apache Parquet is a columnar data storage format created for the Apache Hadoop family of projects. "Columnar" means that all values of one field are stored together, rather than all fields of one record.
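As a rough illustration of what a columnar layout means, the sketch below pivots a handful of row records into per-column lists and run-length encodes one column. It is a toy model in plain Python, not Parquet's file layout or API; the sample records, field names, and the helpers `to_columns` and `run_length_encode` are all made up for this example.

```python
from itertools import groupby

# Hypothetical sample records; the field names are invented for illustration.
rows = [
    {"country": "DE", "clicks": 3},
    {"country": "DE", "clicks": 7},
    {"country": "DE", "clicks": 1},
    {"country": "FR", "clicks": 4},
    {"country": "FR", "clicks": 9},
]

def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    names = list(records[0])
    return {name: [record[name] for record in records] for name in names}

def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

columns = to_columns(rows)
encoded_country = run_length_encode(columns["country"])

print(columns)
print(encoded_country)
```

Because similar values sit next to each other in a column, simple encodings such as the run-length scheme above tend to shrink the data well, and a query that needs only one field can skip the other columns entirely.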

Parquet is well suited to large datasets. Storing values column by column allows efficient per-column compression and encoding, and nested data structures are handled with the record shredding and assembly algorithm described in Google's Dremel paper.

Record shredding allows nested records to be broken down into flat columns when written and reassembled into their original nested structure whenever queried.
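The idea of breaking nested records into columns and putting them back together can be sketched as follows. This is a deliberately simplified model and not the algorithm Parquet actually uses: it handles only nested dictionaries (no repeated fields), marks a missing field with `None` instead of definition and repetition levels, and assumes field names contain no dots. The function names `shred` and `assemble` and the sample records are hypothetical.

```python
def _leaves(node, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf of a nested dict."""
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from _leaves(value, path)
        else:
            yield path, value

def shred(records):
    """Flatten nested dicts into columns keyed by dotted field path."""
    columns = {}
    for index, record in enumerate(records):
        for path, value in _leaves(record):
            # A column first seen at this record is back-filled with None.
            columns.setdefault(path, [None] * index).append(value)
        # Pad every column this record did not populate.
        for column in columns.values():
            if len(column) <= index:
                column.append(None)
    return columns

def assemble(columns):
    """Rebuild the nested records from the shredded columns."""
    length = len(next(iter(columns.values()), []))
    records = []
    for i in range(length):
        record = {}
        for path, column in columns.items():
            value = column[i]
            if value is None:
                continue  # simplification: None means the field was absent
            *parents, leaf = path.split(".")
            node = record
            for part in parents:
                node = node.setdefault(part, {})
            node[leaf] = value
        records.append(record)
    return records

# Hypothetical nested records; the second one has no owner.city field.
records = [
    {"id": 1, "owner": {"name": "Ana", "city": "Lisbon"}},
    {"id": 2, "owner": {"name": "Ben"}},
]
columns = shred(records)
restored = assemble(columns)
print(columns)
print(restored == records)
```

The real format replaces the `None` padding with compact per-value level numbers, which is what lets it represent repeated and optional fields without ambiguity.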

The Parquet format can also be used outside the Hadoop ecosystem. It was designed to be as agnostic as possible, so it is not tied to any particular data processing framework or data storage model.

What is new in this release:

  • Rename packages and Maven coordinates to org.apache
  • Add encoding stats to ColumnMetaData
  • Streaming Thrift API
  • New logical types

What is new in version 2.3.0:

  • Rename packages and Maven coordinates to org.apache
  • Add encoding stats to ColumnMetaData
  • Streaming Thrift API
  • New logical types

Limitations:

  • The project is still under development in the Apache Incubator, and the format might change drastically from version to version.

Similar Software

Apache Empire-db
10 Dec 15

Prom
5 Sep 16

db.js
13 Apr 15

Other Software of Developer Apache Software Foundation

Apache Helix
13 Apr 15

Apache Axiom
6 Mar 16

Apache log4net
9 Feb 16

Apache Tapestry
9 Feb 16
