Job Submission and dynamic provisioning framework for Hadoop Clouds
-------------------------------------------------------------------

                 Key: WHIRR-119
                 URL: https://issues.apache.org/jira/browse/WHIRR-119
             Project: Whirr
          Issue Type: New Feature
          Components: core
    Affects Versions: 0.2.0
            Reporter: Krishna Sankar


A thin framework that can submit an MR job, run it, and report results. Some 
thoughts:
# Most probably it will be a server-side daemon 
# JSON over HTTP with REST semantics
# Functions - top level preliminary
## Accept a job and its components at a well-known URL
## Parse & create MR workflow
## Create & store a job context - ID, security artifacts et al
## Return a status URL that can be used to query status or kill the job (this 
is the REST model)
## Run the job (might include dynamic elastic cloud provisioning, for example 
on OpenStack)
## As the job runs, collect status and store it in the job context
## If the client queries, return the current status
## Once the job is done, store the status and return results (most probably 
pointers to files and so forth)
## Calculate & store performance metrics
## Calculate & store charge-back in generic units (e.g. CPU, memory, network, 
storage)
## As and when the client asks, return job results
# Some thoughts on implementation
## Store context et al in HBase
## A Clojure implementation?
## Packaging like OVF? (with embedded pointers to VMs, data, and so forth)
## For the first release, assume a homogeneous Hadoop infrastructure in a cloud
## Custom reporters/context counters?
## Distributed cache for framework artifacts and run-time monitoring?
## Most probably will have to use taskrunner?
## Extend classes with submission framework setup and teardown code?
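The submit/status/kill lifecycle described above could be sketched roughly as 
follows. This is only an illustrative in-memory model of the proposed REST 
semantics; all class names, endpoint paths, and status values are hypothetical 
and not part of any existing Whirr or Hadoop API:

```python
import uuid


class JobContext:
    """Job context as proposed above: ID, spec, status, results, metrics."""

    def __init__(self, job_spec):
        self.job_id = str(uuid.uuid4())
        self.spec = job_spec      # parsed MR workflow, security artifacts, etc.
        self.status = "PENDING"
        self.results = None       # pointers to output files once done
        self.metrics = {}         # performance / charge-back units


class SubmissionService:
    """In-memory stand-in for the proposed server-side daemon."""

    def __init__(self, base_url="http://example.invalid/jobs"):
        self.base_url = base_url  # hypothetical well-known URL
        self.contexts = {}

    def submit(self, job_spec):
        # Accept a job and its components; create and store a job context.
        ctx = JobContext(job_spec)
        self.contexts[ctx.job_id] = ctx
        ctx.status = "RUNNING"
        # Return a status URL the client can poll (or use to kill the job).
        return {"id": ctx.job_id,
                "status_url": f"{self.base_url}/{ctx.job_id}"}

    def status(self, job_id):
        # Client query: return the current status from the job context.
        return self.contexts[job_id].status

    def complete(self, job_id, result_pointers, metrics):
        # Once the job is done, store status, result pointers, and metrics.
        ctx = self.contexts[job_id]
        ctx.status, ctx.results, ctx.metrics = "DONE", result_pointers, metrics

    def results(self, job_id):
        # As and when the client asks, return job results.
        ctx = self.contexts[job_id]
        return {"status": ctx.status,
                "results": ctx.results,
                "metrics": ctx.metrics}

    def kill(self, job_id):
        self.contexts[job_id].status = "KILLED"
```

In a real implementation the dictionaries returned here would be JSON bodies 
over HTTP, and the context store would live in something durable such as the 
HBase-backed store suggested above.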

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
