DataFu

Software Details:
Version: 1.2.0 / 1.3.0-rc1 updated
Upload Date: 10 Feb 16
Developer: LinkedIn
Distribution Type: Freeware
Downloads: 79

Rating: 5.0/5 (Total Votes: 1)

DataFu was developed at LinkedIn and is written entirely in Java.

DataFu includes functions/libraries for working with:

- Statistics
- Estimation
- Sampling
- Sessions
- Link Analysis
- Set operations
- Bags

DataFu is a collection of user-defined functions (UDFs) for data mining and statistical applications that run on Hadoop, typically through Pig scripts.

These functions let developers work with large data sets stored in Hadoop directly from Pig, without having to implement common statistical, sampling, and set operations themselves.

What is new in version 1.2.0:

  • Pair of UDFs for simple random sampling with replacement.
  • More dependencies now packaged in DataFu so fewer JAR dependencies required.
  • SetDifference UDF for computing set difference (e.g. A-B or A-B-C).
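To illustrate what a chained set difference such as A-B-C means, here is a minimal Java sketch. It is not DataFu's SetDifference implementation and does not use the DataFu or Pig APIs; the class name `SetDifferenceSketch` and the method `difference` are hypothetical, and plain in-memory sets stand in for Pig bags.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration of chained set difference (A-B, A-B-C).
// Assumption: in-memory sets stand in for Pig bags; this is not DataFu code.
public class SetDifferenceSketch {

    // Returns the elements of `first` that appear in none of `others`,
    // keeping the iteration order of `first`.
    @SafeVarargs
    static <T> Set<T> difference(Set<T> first, Set<T>... others) {
        Set<T> result = new LinkedHashSet<>(first);
        for (Set<T> other : others) {
            result.removeAll(other); // subtract each later set in turn
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> a = new LinkedHashSet<>(Arrays.asList(1, 2, 3, 4, 5));
        Set<Integer> b = new LinkedHashSet<>(Arrays.asList(2, 4));
        Set<Integer> c = new LinkedHashSet<>(Arrays.asList(5));

        System.out.println(difference(a, b));    // prints [1, 3, 5]
        System.out.println(difference(a, b, c)); // prints [1, 3]
    }
}
```

Subtracting the sets one after another gives the same result as removing the union of B and C from A, so the order of the later arguments does not matter.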

What is new in version 1.1.0:

  • Added SHA hash UDF.
  • InUDF and AssertUDF added for Pig 0.12 compatibility. These are the same as In and Assert.
  • SimpleRandomSample, which implements a scalable simple random sampling algorithm.
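As a rough idea of what a SHA hash function computes, the following Java sketch hashes a string with the standard `java.security.MessageDigest` class. It is not the DataFu UDF itself: the class name `ShaHashSketch` and the method `sha256Hex` are hypothetical, and the choice of SHA-256, UTF-8 input, and lowercase hex output is an assumption, since the release notes do not say which variant or encoding the UDF uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration of hashing a string with SHA.
// Assumptions: SHA-256, UTF-8 input, lowercase hex output; not DataFu code.
public class ShaHashSketch {

    static String sha256Hex(String input) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(input.getBytes(StandardCharsets.UTF_8));

            // Render each byte as two lowercase hex digits.
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform is required to provide SHA-256.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc"));
    }
}
```

The same input always yields the same 64-character digest, which is what makes a hash like this useful for anonymizing or bucketing identifiers in a data pipeline.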
