Apache Hive

Software Details:
Version: 1.2.1 updated
Upload Date: 21 Jul 15
Distribution Type: Freeware

Apache Hive was first developed as an Apache Hadoop sub-project to give Hadoop administrators an easy-to-use, capable query language for their data.

Because of this, Hive was designed from the start to process huge amounts of information in each query, which makes it well suited to large-scale databases and business environments.

Tools are included for easily extracting, transforming and loading data, and a custom structure can be imposed on a wide range of data formats.
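
To illustrate what imposing structure on a data format can look like, here is a minimal Python sketch of the schema-on-read idea: the raw text stays untouched, and column names and types are applied only when a line is read. The schema, the sample rows and the apply_schema helper are hypothetical and are not part of Hive; the Ctrl-A delimiter is the default field separator Hive uses for text tables.

```python
from datetime import date

# Ctrl-A (\x01) is Hive's default field delimiter for text tables.
FIELD_DELIMITER = "\x01"

# Hypothetical schema for illustration: (column name, parser) pairs.
SCHEMA = [("user_id", int), ("country", str), ("signup", date.fromisoformat)]


def apply_schema(line, schema=SCHEMA, delimiter=FIELD_DELIMITER):
    """Turn one raw text line into a dict; bad or missing fields become None."""
    raw = line.rstrip("\n").split(delimiter)
    row = {}
    for i, (name, parse) in enumerate(schema):
        try:
            row[name] = parse(raw[i])
        except (IndexError, ValueError):
            # Stand-in for a NULL when the stored text does not fit the schema.
            row[name] = None
    return row


# Hypothetical sample data: one well-formed line, one malformed line.
lines = ["42\x01DE\x012015-07-21\n", "oops\x01FR\n"]
rows = [apply_schema(line) for line in lines]
```

The malformed second line still produces a row; the fields that cannot be parsed simply come back empty, which is the trade-off of checking structure at read time rather than at load time.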

Since it is a Hadoop-related project, HDFS and HBase are also supported as storage back ends.

HiveQL is probably the best part of the project: a simple and efficient SQL-like query language that also lets users plug in custom mappers and reducers when the built-in syntax cannot express the desired logic.
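
As a rough illustration of the mapper and reducer idea, the Python sketch below counts words the way a custom map and reduce step might. It assumes rows arrive as tab-separated text lines, which is how Hive streams rows to external scripts; the function names and the sample input are hypothetical, and the sort inside run() only stands in locally for the shuffle that the cluster would perform.

```python
from itertools import groupby


def mapper(lines):
    """Emit (word, 1) for every word found in tab-separated input rows."""
    for line in lines:
        for word in line.rstrip("\n").replace("\t", " ").split():
            yield word.lower(), 1


def reducer(pairs):
    """Sum the counts per word; the pairs must arrive sorted by word."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


def run(lines):
    """Local stand-in for the shuffle: sort the mapper output, then reduce."""
    return list(reducer(sorted(mapper(lines))))


# Hypothetical input rows, for illustration only.
result = run(["big data\tBig\n", "data\n"])
```

Keeping the two steps as plain generators makes them easy to test locally before wiring them into a real job.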

What is new in this release:

  • Support uncorrelated subqueries in the WHERE clause
  • Add NULL DEFINED AS to ROW FORMAT specification
  • Create/drop database should populate inputs/outputs and check concurrency and user permission
  • Support specifying scale and precision with Hive decimal type
  • Let there be Tez
  • An explode function that includes the item's position in the array
  • Add char data type
  • Create collect UDF and make evaluator reusable
  • Extend record writer and ORC reader/writer interfaces to provide statistics
  • Implement statistics providing ORC writer and reader interfaces
  • Annotate hive operator tree with statistics from metastore
  • Provide stripe level column statistics in ORC
  • Subquery support: disallow nesting of SubQueries
  • Subquery support: allow subquery expressions in having clause
  • Subquery support: more tests
  • Native Parquet Support in Hive
  • Hive should be able to skip header and footer rows when reading data file for a table
  • Add DATE, TIMESTAMP, DECIMAL, CHAR, VARCHAR types support in HCat
  • Use map-join hint to cache intermediate result
  • Add UDF to calculate distance between geographic coordinates
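
Two items in the list above are easy to picture with a small Python sketch: the explode variant that also reports each item's position (exposed in Hive as posexplode, with zero-based positions) and the skipping of header and footer rows (controlled in Hive through the skip.header.line.count and skip.footer.line.count table properties). The helper functions and the sample file content below are hypothetical and only emulate the semantics; they are not Hive code.

```python
def posexplode(items):
    """Return (position, value) pairs, zero-based, one per array element."""
    return list(enumerate(items))


def skip_header_footer(lines, header_count=0, footer_count=0):
    """Drop the first header_count and the last footer_count lines."""
    end = len(lines) - footer_count
    return lines[header_count:end] if end > header_count else []


# Hypothetical file content: one header line, two data lines, one footer line.
file_lines = ["id,tags", "1,a|b|c", "2,d", "-- 2 rows --"]
data = skip_header_footer(file_lines, header_count=1, footer_count=1)

# One output row per tag, carrying the tag's position within its array.
exploded = [
    (row.split(",")[0], pos, tag)
    for row in data
    for pos, tag in posexplode(row.split(",")[1].split("|"))
]
```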

What is new in version 0.8.1:

  • Tools to enable easy data extract/transform/load (ETL).
  • A mechanism to impose structure on a variety of data formats.
  • Access to files stored either directly in Apache HDFS (TM) or in other data storage systems such as Apache HBase (TM).
  • Query execution via MapReduce.

What is new in version 0.7.1:

  • Bugs:
  • Exception on Windows when using the JDBC driver: "IOException: The system cannot find the path specified".
  • Schema creation scripts are incomplete since they leave out tables that are specific to DataNucleus.
  • Improvements:
  • Improve miscellaneous error messages.
  • Return correct Major / Minor version numbers for JDBC Hive Driver.
  • Add the HivePreparedStatement implementation based on current HIVE supported data-type.
  • Tasks:
  • Hive in Maven.
  • Provide Metastore upgrade scripts and default schemas for PostgreSQL.

What is new in version 0.7.0:

  • New Feature:
  • Authorization infrastructure for Hive
  • Implement Indexing in Hive
  • Add reflect() UDF for reflective invocation of Java methods
  • Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
  • Implement GenericUDF str_to_map
  • Patch to support HAVING clause in Hive
  • Track the joins which are being converted to map-join automatically
  • Call frequency and duration metrics for HiveMetaStore via jmx
  • Maintain lastAccessTime in the metastore
  • Improvement:
  • Provide option to export a HEADER
  • Support for distinct selection on two or more columns
  • Describe extended table/partition output is cryptic
  • Missing some JDBC functionality like getTables, getColumns and HiveResultSet.get* methods based on column name.
  • Tapping logs from child processes
  • Support filter pushdown against non-native tables
  • Replace dependencies on HBase deprecated API
  • Add queryid while locking
  • Update transient_lastDdlTime only if not specified
  • Add more debug information for hive locking
  • HiveInputFormat or CombineHiveInputFormat always sync blocks of RCFile twice
  • Show the time the local task takes
  • Create a new ZooKeeper instance when retrying lock, and more info for debug
  • Add an option to run task to check map-join possibility in non-local mode
  • More debugging for locking
  • Add an option in dynamic partition inserts to throw an error if 0 partitions are created
  • Bugs:
  • "LOAD DATA LOCAL INPATH" fails when the table already contains a file of the same name
  • NULL is not handled correctly in join
  • HiveInputFormat.getInputFormatFromCache "swallows" cause exception when throwing IOException
  • Add progress in join and groupby
  • Simple UDAFs with more than 1 parameter crash on empty row query
  • UDF field() doesn't work
  • Dynamic partition inserts left empty files uncleaned in Hadoop 0.17 local mode
  • Skip counter update when RunningJob.getCounters() returns null
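
The str_to_map function mentioned in the list above splits a string into key/value pairs using two delimiters. The Python sketch below emulates that behaviour under a few stated assumptions: the defaults of ',' between pairs and ':' between key and value are assumed to match Hive's, delimiters are treated as literal strings rather than patterns, and a key without a value maps to None. It is an approximation for illustration, not the Hive implementation.

```python
def str_to_map(text, pair_delim=",", kv_delim=":"):
    """Split text into a dict using one delimiter per level.

    Assumptions made for this sketch: literal (non-regex) delimiters,
    and a key with no key/value delimiter maps to None.
    """
    if text is None:
        return None
    result = {}
    for pair in text.split(pair_delim):
        key, sep, value = pair.partition(kv_delim)
        result[key] = value if sep else None
    return result


# Hypothetical inputs, for illustration only.
defaults = str_to_map("a:1,b:2")
custom = str_to_map("k=v&x", "&", "=")
```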

What is new in version 0.5.0:

  • Let user specify serde for custom scripts.
  • Add UDF unhex.
  • Remove lzocodec import from FileSinkOperator.
  • Driver NullPointerException when calling getResults without first compiling.
  • Performance improvement for RCFile and ColumnarSerDe in Hive.
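
For readers unfamiliar with unhex, it is the inverse of hex: each pair of hexadecimal characters is decoded into one byte. A minimal Python emulation follows; returning None for malformed input is a choice made for this sketch and is not taken from the Hive documentation.

```python
def unhex(hex_string):
    """Decode a hexadecimal string into bytes, two characters per byte."""
    if hex_string is None:
        return None
    try:
        return bytes.fromhex(hex_string)
    except ValueError:
        # Sketch-only convention: malformed input yields None.
        return None


decoded = unhex("48697665")
round_trip = decoded.hex()
```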
