I'll conclude the vote with my obligatory +1. With 8 binding +1s, 2 non-binding +1s and 2 +0s (Daniel's may actually be 0.5 --> 1:)), the vote passes. We'll be merging this to trunk shortly.
Thanks everyone for taking the time/effort to vote.

Cheers,
Subru/Carlo

On Tue, Aug 1, 2017 at 3:22 PM, [email protected] <[email protected]> wrote:

> +1.
> Looking forward to it.
>
> Regards,
> Varun Saxena.
>
> On Wed, Jul 26, 2017 at 8:54 AM, Subru Krishnan <[email protected]> wrote:
>
> > Hi all,
> >
> > Per earlier discussion [9], I'd like to start a formal vote to merge
> > feature YARN Federation (YARN-2915) [1] to trunk. The vote will run for 7
> > days, and will end Aug 1 7PM PDT.
> >
> > We have been developing the feature in a branch (YARN-2915 [2]) for a
> > while, and we are reasonably confident that the state of the feature meets
> > the criteria to be merged onto trunk.
> >
> > *Key Ideas*:
> >
> > YARN’s centralized design allows strict enforcement of scheduling
> > invariants and effective resource sharing, but becomes a scalability
> > bottleneck (in number of jobs and nodes) well before reaching the scale of
> > our clusters (e.g., 20k-50k nodes).
> >
> > To address these limitations, we developed a scale-out, federation-based
> > solution (YARN-2915). Our architecture scales near-linearly to datacenter
> > sized clusters, by partitioning nodes across multiple sub-clusters (each
> > running a YARN cluster of a few thousand nodes). Applications can span
> > multiple sub-clusters *transparently (i.e. no code change or recompilation
> > of existing apps)*, thanks to a layer of indirection that negotiates with
> > multiple sub-clusters' Resource Managers on behalf of the application.
> >
> > This design is structurally scalable, as it bounds the number of nodes each
> > RM is responsible for. Appropriate policies ensure that the majority of
> > applications reside within a single sub-cluster, thus further controlling
> > the load on each RM. This provides near-linear scale-out by simply adding
> > more sub-clusters. The same mechanism enables pooling of resources from
> > clusters owned and operated by different teams.
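As a rough illustration of the two ideas quoted above (bounding the number of nodes per RM, and a policy that gives each application a single home sub-cluster), here is a minimal Python sketch. It is a toy model, not the YARN-2915 code: SubCluster, partition_nodes, pick_home_sub_cluster and submit are hypothetical names, and the least-loaded rule is only an assumed stand-in for the real policies.

```python
from dataclasses import dataclass


@dataclass
class SubCluster:
    """Toy stand-in for one sub-cluster and its Resource Manager."""
    name: str
    nodes: int             # nodes this sub-cluster's RM is responsible for
    running_apps: int = 0  # applications homed on this sub-cluster


def partition_nodes(total_nodes, max_nodes_per_rm):
    """Split total_nodes into sub-clusters of at most max_nodes_per_rm nodes."""
    count = -(-total_nodes // max_nodes_per_rm)  # ceiling division
    base, extra = divmod(total_nodes, count)
    return [
        SubCluster(f"sc-{i}", base + (1 if i < extra else 0))
        for i in range(count)
    ]


def pick_home_sub_cluster(sub_clusters):
    """Assumed policy: choose the sub-cluster with the fewest apps per node."""
    return min(sub_clusters, key=lambda sc: sc.running_apps / sc.nodes)


def submit(sub_clusters, app_id):
    """Home an application on one sub-cluster and return that sub-cluster's name."""
    home = pick_home_sub_cluster(sub_clusters)
    home.running_apps += 1
    return home.name


if __name__ == "__main__":
    clusters = partition_nodes(20000, 4000)
    for i in range(6):
        print(f"app-{i}", "->", submit(clusters, f"app-{i}"))
```

Growing total_nodes in this model only increases the number of sub-clusters; the per-RM node count stays under the chosen bound, which is the structural property the quoted text relies on.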
> >
> > Status:
> >
> > - The version we would like to merge to trunk is termed "MVP" (minimal
> >   viable product). The feature will have a complete end-to-end application
> >   execution flow with the ability to span a single application across
> >   multiple YARN (sub) clusters.
> > - There were 50+ sub-tasks that were completed as part of this effort.
> >   Every patch has been reviewed and +1ed by a committer. Thanks to Jian,
> >   Wangda, Karthik, Vinod, Varun & Arun for the thorough reviews!
> > - Federation is designed to be built around YARN and consequently has
> >   minimal code changes to core YARN. The relevant JIRAs that modify the
> >   existing YARN code base are YARN-3671 [7] & YARN-3673 [8]. We also paid
> >   close attention to ensure that if federation is disabled there is zero
> >   impact on existing functionality (it is disabled by default).
> > - We found a few bugs as we went along, which we fixed directly upstream
> >   in trunk and/or branch-2.
> > - We have been continuously rebasing the feature branch [2], so the merge
> >   should be a straightforward cherry-pick.
> > - The current version has been rather thoroughly tested and is currently
> >   deployed in a *10,000+ node federated YARN cluster that's running
> >   upwards of 50k jobs daily with a reliability of 99.9%*.
> > - We have a few ideas for follow-up extensions/improvements, which are
> >   tracked in the umbrella JIRA YARN-5597 [3].
> >
> > Documentation:
> >
> > - Quick start guide (maven site) - YARN-6484 [4].
> > - The overall design doc [5] and the slide deck [6] we used for our talk
> >   at Hadoop Summit 2016 are available in the umbrella JIRA, YARN-2915.
> >
> > Credits:
> >
> > This is a group effort that would not have been possible without the ideas
> > and hard work of many other folks, and we would like to specifically call
> > out Giovanni, Botong & Ellen for their invaluable contributions.
> > Also, a big thanks to the many folks in the community (Sriram, Kishore,
> > Sarvesh, Jian, Wangda, Karthik, Vinod, Varun, Inigo, Vrushali, Sangjin,
> > Joep, Rohith and many more) who helped us shape our ideas and code with
> > very insightful feedback and comments.
> >
> > Cheers,
> > Subru & Carlo
> >
> > [1] YARN-2915: https://issues.apache.org/jira/browse/YARN-2915
> > [2] https://github.com/apache/hadoop/tree/YARN-2915
> > [3] YARN-5597: https://issues.apache.org/jira/browse/YARN-5597
> > [4] YARN-6484: https://issues.apache.org/jira/browse/YARN-6484
> > [5] https://issues.apache.org/jira/secure/attachment/12733292/Yarn_federation_design_v1.pdf
> > [6] https://issues.apache.org/jira/secure/attachment/12819229/YARN-Federation-Hadoop-Summit_final.pptx
> > [7] YARN-3671: https://issues.apache.org/jira/browse/YARN-3671
> > [8] YARN-3673: https://issues.apache.org/jira/browse/YARN-3673
> > [9] http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201706.mbox/%3CCAOScs9bSsZ7mzH15Y%2BSPDU8YuNUAq7QicjXpDoX_tKh3MS4HsA%40mail.gmail.com%3E
