Hadoopy

Software Details:
Version: 0.6.0
Upload Date: 12 May 15
Developer: Brandyn White
Distribution Type: Freeware
Downloads: 58

Rating: 3.0/5 (Total Votes: 2)

Hadoopy is a Python library for writing and running MapReduce jobs on Hadoop. Its performance-critical code is written in Cython.

Features:

  • Interface similar to the Hadoop API (design patterns carry over between the Python and Java interfaces)
  • General compatibility with dumbo to allow users to switch back and forth
  • Usable on Hadoop clusters without Python or admin access
  • Fast conversion and processing
  • Small and well-documented codebase
  • Transparent about what is going on
  • Handles programs with complicated .so files, ctypes, and extensions
  • Code written to be easy to hack on
  • Simple HDFS access (e.g., reading, writing, ls)
  • Supports (rather than replicates) the greater Hadoop ecosystem (e.g., Oozie, Whirr)
  • Automated job parallelization ('auto-oozie') available in the hadoopy flow project (maintained out of branch)
  • Local execution of unmodified MapReduce jobs with launch_local
  • Read/write sequence files of TypedBytes directly to HDFS from Python (readtb, writetb)
  • Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique; both streams appear in the task's stderr)
  • Works on clusters without any extra installation, Python, or Python libraries (uses PyInstaller, which is included in the source tree)
  • Works on OS X
  • Critical path is in Cython
  • Simple HDFS access (readtb and ls) inside Python, even inside running jobs
  • Unit test interface
  • Reporting using status and counters (and print statements, which are safe to use in Hadoopy)
  • Supports the design patterns in the Lin & Dyer book
  • TypedBytes support (very fast)
  • Oozie support
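
To give a feel for the programming model behind the features above, here is a minimal word-count sketch. It assumes the generator style Hadoopy jobs are usually written in: a mapper that takes a key and a value and yields key/value pairs, and a reducer that takes a key and an iterable of values. The helper run_local_sketch is hypothetical and only imitates the map, group, and reduce steps in memory so the example runs without Hadoop or Hadoopy installed. It is not Hadoopy's launch_local.

```python
from collections import defaultdict


def mapper(key, value):
    """Emit (word, 1) for every whitespace-separated word in a line of text."""
    for word in value.split():
        yield word, 1


def reducer(key, values):
    """Sum the counts gathered for one word."""
    yield key, sum(values)


def run_local_sketch(records, mapper, reducer):
    """Hypothetical helper (not part of Hadoopy): map, group by key, reduce."""
    grouped = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            grouped[out_key].append(out_value)
    results = []
    # Hadoop hands keys to the reducer in sorted order; imitate that here.
    for key in sorted(grouped):
        results.extend(reducer(key, grouped[key]))
    return results


# Input records are (key, line) pairs; the keys here are made-up line numbers.
lines = [(0, "a b a"), (1, "b c")]
counts = run_local_sketch(lines, mapper, reducer)
print(counts)  # [('a', 2), ('b', 2), ('c', 1)]
```

In an actual Hadoopy script, the same mapper and reducer would typically be handed to hadoopy.run(mapper, reducer) at the bottom of the file, and the job would be started with one of the library's launch functions such as launch_local. Check the Hadoopy documentation for the exact arguments.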

Requirements:

  • Cython 0.13 or higher
