Software Details:
Version: 2.3.1 updated
Upload Date: 9 Feb 16
Distribution Type: Freeware
Downloads: 159
Apache Parquet is a "columnar" data storage format that was specifically created for the Apache Hadoop family of projects.
Parquet is well suited to large datasets: it stores data column by column, which compresses efficiently, and it relies on optimized record shredding and re-assembly algorithms to handle nested data.
This allows nested records to be broken down into individual columns for storage and reassembled into their original structure whenever they are queried.
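To make the idea of shredding and re-assembly more concrete, here is a minimal Python sketch. It is only an illustration and not Parquet's actual algorithm: it assumes records are plain nested dictionaries with no repeated (list) fields and no dots in their keys, and it uses None to mark a missing value, where the real format tracks optional and repeated fields with extra per-value levels. The function names shred and assemble are hypothetical and do not belong to any Parquet API.

```python
def _flatten(record, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf in a nested dict."""
    for key, value in record.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from _flatten(value, path + ".")
        else:
            yield path, value


def shred(records):
    """Split nested records into one list of values per column path."""
    columns = {}
    for row_index, record in enumerate(records):
        for path, value in _flatten(record):
            # A column first seen in a later row is back-filled with None.
            column = columns.setdefault(path, [None] * row_index)
            column.append(value)
        # Pad every column this record did not contain.
        for column in columns.values():
            if len(column) <= row_index:
                column.append(None)
    return columns


def assemble(columns):
    """Rebuild the nested records from the column lists."""
    row_count = max((len(values) for values in columns.values()), default=0)
    records = []
    for row_index in range(row_count):
        record = {}
        for path, values in columns.items():
            value = values[row_index]
            if value is None:
                continue  # assumption: None means "field absent"
            *parents, leaf = path.split(".")
            node = record
            for parent in parents:
                node = node.setdefault(parent, {})
            node[leaf] = value
        records.append(record)
    return records
```

Because each column ends up as its own list, a query that needs only one field can read that list alone, and values of the same type sit next to each other, which is what makes columnar data compress well.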
The Parquet format can also be used outside the Hadoop ecosystem. It was designed to be as agnostic as possible, so it works with any type of data processing framework and data storage model.
What is new in this release:
- Rename packages and maven coordinates to org.apache
- Add encoding stats to ColumnMetaData
- Streaming thrift API
- New logical types
What is new in version 2.3.0:
- Rename packages and maven coordinates to org.apache
- Add encoding stats to ColumnMetaData
- Streaming thrift API
- New logical types
Limitations:
- The project is still under development in the Apache Incubator repository and might change drastically from version to version.