Dear all,

Good morning,

This is Kiran Kumar Pulamolu, doing research on resource optimization in
Hadoop through adaptive resource sharing with fairness policies. I would like
to contribute to this group. Please advise on how to proceed.

Thank you,
Kiran Kumar Pulamolu
+919492400797

On 24 Jun 2017 5:42 a.m., "Wangda Tan" <[email protected]> wrote:

> Thanks all for working on the feature, I'm in favor of moving forward as
> well.
>
> Best,
> Wangda
>
> On Fri, Jun 23, 2017 at 2:44 PM, Sangjin Lee <[email protected]> wrote:
>
> > Thanks for the clarification Subru. I am in favor of moving forward.
> >
> >
> > Sangjin
> >
> > On Thu, Jun 22, 2017 at 6:21 PM, Karthik Shashank Kambatla <
> > [email protected]> wrote:
> >
> > > Given RTC and the amount of production testing this feature has received,
> > > I am totally in favor of this merge.
> > >
> > >
> > >
> > > On Tue, Jun 20, 2017 at 4:28 PM, Subru Krishnan <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > We would like to open a discussion on merging the YARN Federation
> > > > (YARN-2915) [1] feature to trunk. We have been developing the feature in a
> > > > feature branch (YARN-2915 [2]) for a while, and we are reasonably confident
> > > > that the state of the feature meets the criteria to be merged onto trunk.
> > > >
> > > > *Key Ideas*:
> > > >
> > > > YARN’s centralized design allows strict enforcement of scheduling
> > > > invariants and effective resource sharing, but becomes a scalability
> > > > bottleneck (in number of jobs and nodes) well before reaching the scale of
> > > > our clusters (e.g., 20k-50k nodes).
> > > >
> > > >
> > > > To address these limitations, we developed a scale-out, federation-based
> > > > solution (YARN-2915). Our architecture scales near-linearly to
> > > > datacenter-sized clusters by partitioning nodes across multiple
> > > > sub-clusters (each running a YARN cluster of a few thousand nodes).
> > > > Applications can span multiple sub-clusters *transparently (i.e. no code
> > > > change or recompilation of existing apps)*, thanks to a layer of
> > > > indirection that negotiates with multiple sub-clusters' Resource Managers
> > > > on behalf of the application.
> > > >
> > > >
> > > > This design is structurally scalable, as it bounds the number of nodes each
> > > > RM is responsible for. Appropriate policies ensure that the majority of
> > > > applications reside within a single sub-cluster, thus further controlling
> > > > the load on each RM. This provides near-linear scale-out by simply adding
> > > > more sub-clusters. The same mechanism enables pooling of resources from
> > > > clusters owned and operated by different teams.
> > > >
> > > > Status:
> > > >
> > > >    - The version we would like to merge to trunk is termed "MVP" (minimal
> > > >    viable product). The feature will have a complete end-to-end
> > > >    application execution flow with the ability to span a single
> > > >    application across multiple YARN (sub) clusters.
> > > >    - There were 50+ sub-tasks that were completed as part of this
> > > >    effort. Every patch has been reviewed and +1ed by a committer. Thanks
> > > >    to Jian, Wangda, Karthik, Vinod, Varun & Arun for the thorough reviews!
> > > >    - Federation is designed to be built around YARN and consequently has
> > > >    minimal code changes to core YARN. The relevant JIRAs that modify the
> > > >    existing YARN code base are YARN-3671 [7] & YARN-3673 [8]. We also paid
> > > >    close attention to ensure that if federation is disabled there is zero
> > > >    impact to existing functionality (disabled by default).
> > > >    - We found a few bugs as we went along, which we fixed directly
> > > >    upstream in trunk and/or branch-2.
> > > >    - We have been continuously rebasing the feature branch [2], so the
> > > >    merge should be a straightforward cherry-pick.
> > > >    - The current version has been rather thoroughly tested and is
> > > >    currently deployed in a *10,000+ node federated YARN cluster that's
> > > >    running upwards of 50k jobs daily with a reliability of 99.9%*.
> > > >    - We have a few ideas for follow-up extensions/improvements, which are
> > > >    tracked in the umbrella JIRA YARN-5597 [3].
> > > >
> > > >
> > > > Documentation:
> > > >
> > > >    - Quick start guide (maven site) - YARN-6484 [4].
> > > >    - The overall design doc [5] and the slide deck [6] we used for our
> > > >    talk at Hadoop Summit 2016 are available in the umbrella JIRA -
> > > >    YARN-2915.
> > > >
> > > >
> > > > Credits:
> > > >
> > > > This is a group effort that would not have been possible without the ideas
> > > > and hard work of many other folks, and we would like to specifically call
> > > > out Giovanni, Botong & Ellen for their invaluable contributions. Also, big
> > > > thanks to the many folks in the community (Sriram, Kishore, Sarvesh, Jian,
> > > > Wangda, Karthik, Vinod, Varun, Inigo, Vrushali, Sangjin, Joep, Rohith and
> > > > many more) who helped us shape our ideas and code with very insightful
> > > > feedback and comments.
> > > >
> > > > We plan to start the merge vote in the next week or so. The branch is close
> > > > to complete (~5 patches before one can kick the tires on a running
> > > > deployment). Please look through the branch; feedback is welcome. Thanks!
> > > >
> > > > Cheers,
> > > > Subru & Carlo
> > > >
> > > > [1] YARN-2915: https://issues.apache.org/jira/browse/YARN-2915
> > > > [2] https://github.com/apache/hadoop/tree/YARN-2915
> > > > [3] YARN-5597: https://issues.apache.org/jira/browse/YARN-5597
> > > > [4] YARN-6484: https://issues.apache.org/jira/browse/YARN-6484
> > > > [5] https://issues.apache.org/jira/secure/attachment/12733292/Yarn_federation_design_v1.pdf
> > > > [6] https://issues.apache.org/jira/secure/attachment/12819229/YARN-Federation-Hadoop-Summit_final.pptx
> > > > [7] YARN-3671: https://issues.apache.org/jira/browse/YARN-3671
> > > > [8] YARN-3673: https://issues.apache.org/jira/browse/YARN-3673
> > > >
> > >
> >
>
