Job Submission and dynamic provisioning framework for Hadoop Clouds
-------------------------------------------------------------------
Key: WHIRR-119
URL: https://issues.apache.org/jira/browse/WHIRR-119
Project: Whirr
Issue Type: New Feature
Components: core
Affects Versions: 0.2.0
Reporter: Krishna Sankar
A thin framework that can submit an MR job, run it, and report results. Some
thoughts:
# Most probably it will be a server-side daemon
# JSON over HTTP with REST semantics
# Functions (top-level, preliminary)
## Accept a job and its components at a well-known URL
## Parse & create MR workflow
## Create & store a job context - ID, security artifacts et al
## Return a status URL that can be used to query status or kill the job (this is the REST model)
## Run the job (might include dynamic elastic cloud provisioning, e.g. via OpenStack)
## As the job runs, collect progress and metrics and store them in the job context
## If the client queries, return the current status
## Once the job is done, store the final status and return results (most probably pointers to output files and so forth)
## Calculate & store performance metrics
## Calculate & store chargeback in generic units (e.g. CPU, memory, network, storage)
## As and when the client asks, return job results
# Some thoughts on implementation
## Store context et al in HBase
## A Clojure implementation?
## Packaging like OVF? (with embedded pointers to VMs, data and so forth)
## For the 1st release, assume a homogeneous Hadoop infrastructure in a cloud
## Custom reporter/context counters?
## Distributed cache for framework artifacts and runtime monitoring?
## Most probably will have to use TaskRunner?
## Extend classes with submission framework setup and teardown code?
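The submit/status cycle described above (accept a job at a well-known URL, create a job context, hand back a status URL that can also kill the job) could be sketched roughly as below. Everything here is hypothetical: the endpoint paths, field names, state values, and the in-memory dict standing in for the persistent context store (HBase in the proposal) are illustrative only, not a proposed API.

```python
# Illustrative sketch of the submit -> status-URL -> query/kill cycle.
# The JOBS dict is a stand-in for a persistent job-context store (e.g. HBase).
import json
import uuid

JOBS = {}  # job_id -> job context

def submit_job(job_json):
    """Accept a job description, create & store a job context, return a status URL."""
    spec = json.loads(job_json)
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {
        "id": job_id,
        "spec": spec,          # the parsed MR workflow would live here
        "state": "QUEUED",     # QUEUED -> RUNNING -> DONE / KILLED
        "results": None,       # filled in with pointers to output files
    }
    return "/jobs/%s/status" % job_id

def get_status(job_id):
    """What a GET on the status URL would return."""
    ctx = JOBS[job_id]
    return {"id": ctx["id"], "state": ctx["state"]}

def kill_job(job_id):
    """A DELETE on the same status URL could map to killing the job."""
    JOBS[job_id]["state"] = "KILLED"
```

In a real daemon these functions would sit behind an HTTP server speaking JSON, and the status URL would double as the kill handle, keeping the REST semantics of one resource per job.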
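The chargeback item above (generic units over CPU, memory, network, storage) might reduce to a weighted sum over the collected metrics. The weights and metric names below are made-up placeholders; a real deployment would set them per pricing policy.

```python
# Hypothetical chargeback: fold raw per-job usage into one generic-unit figure.
# Weights are illustrative placeholders, not proposed rates.
WEIGHTS = {
    "cpu_seconds": 1.0,
    "memory_mb_seconds": 0.001,
    "network_mb": 0.01,
    "storage_mb": 0.005,
}

def chargeback(usage):
    """Sum weighted resource usage into generic billing units."""
    return sum(WEIGHTS[k] * v for k, v in usage.items())
```

The resulting number could be stored in the job context alongside the performance metrics, so billing queries reuse the same store as status queries.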