If this vote is meant for all branches:

+1 to merge to trunk
+1 to merge to branch-2
+1 to merge to branch-2.6, provided we "label" this feature experimental/alpha until the follow-up items are addressed.
-0 to unconditional merge to branch-2.6.

PS: We should decide on the way to communicate the stability of a feature. Maybe the new-feature notes in the release documentation should have this label?

On Wed, Oct 1, 2014 at 6:23 PM, Karthik Kambatla <[email protected]> wrote:

> +1. Nicely done, Subru and Carlo.
>
> I have been partially involved with the work, and have reviewed some of
> the patches. With some help from Subru and documentation from Carlo
> (thanks!), I was able to play with the reservation system. Verified the
> following:
> 1. Reservations can be made only for the amount of resources available for
>    that queue.
> 2. Jobs submitted against a reservation run in the corresponding
>    "reservation" queue, and jobs submitted to the same higher-level queue but
>    not against a reservation run in the corresponding "default" queue.
> 3. The web-ui shows the reserved resources in a queue even when there are
>    no apps running.
>
> There are a few follow-up items towards feature completeness, and I am
> okay with working on them post merge to trunk as planned.
> 1. Support for FairScheduler
> 2. Recover reservations on RM restart/failover
> 3. CLI and/or REST APIs to make reservations - this is very useful for
>    testing
> 4. Documentation in the usual apt.vm format.
>
> Cheers!
> Karthik
>
> On Wed, Oct 1, 2014 at 1:29 PM, Wangda Tan <[email protected]> wrote:
>
>> +1 (non-binding),
>> Reviewed several patches related to scheduler side changes. As Jian
>> mentioned, this will not affect existing behavior.
>> Looking forward to this feature being used by more people. Thanks to Carlo
>> and Subru!
>>
>> Thanks,
>> Wangda
>>
>> On Wed, Oct 1, 2014 at 1:21 PM, Jian He <[email protected]> wrote:
>>
>> > +1,
>> >
>> > Carlo and Subru, great job! Thanks for your contribution!
>> > I reviewed a couple of CapacityScheduler related patches; they are in
>> > good shape. At a minimum, they do not affect existing behavior, so they
>> > should be safe to merge.
>> >
>> > Jian
>> >
>> > On Wed, Oct 1, 2014 at 2:46 AM, Thomas Jungblut <[email protected]> wrote:
>> >
>> > > +1 (non-binding)
>> > > Thanks for adding this, really useful feature.
>> > >
>> > > On 30 September 2014 19:40, Chris Douglas <[email protected]> wrote:
>> > >
>> > > > +1
>> > > >
>> > > > Excellent work, Carlo and Subru. -C
>> > > >
>> > > > On Fri, Sep 26, 2014 at 11:50 AM, Carlo Curino <[email protected]> wrote:
>> > > > > (Apologies if it is delivered twice.)
>> > > > >
>> > > > > YARN Devs,
>> > > > >
>> > > > > We propose to merge the YARN-1051 development branch into trunk.
>> > > > >
>> > > > > Key Idea:
>> > > > > This work adds support for Reservations to YARN RM. The key idea is to
>> > > > > allow users to request dedicated access to resources (a reservation),
>> > > > > ahead of time. For example I can ask for "10 containers for 1 hour
>> > > > > sometime between 4pm and 9pm today". The RM keeps track of the accepted
>> > > > > reservation by means of a Plan (think of it as an agenda on how the
>> > > > > cluster resources will be used), and performs admission control to
>> > > > > guarantee that if a reservation is accepted enough resources are set
>> > > > > aside to satisfy it. We enforce the reservation promises by dynamically
>> > > > > creating/resizing/removing queues at the right time. This allows us to
>> > > > > leverage the existing schedulers for the actual container assignment
>> > > > > and tracking. The key benefit is to expose to the scheduler flexibility
>> > > > > of allocation, while guaranteeing users predictable resource allocation.
>> > > > >
>> > > > > Status
>> > > > >
>> > > > > * The work has been "broken down" into 14 subtasks (+3 patches already
>> > > > >   committed to trunk for move/kill of apps). All the issues have been
>> > > > >   resolved.
>> > > > >
>> > > > > * Jenkins +1ed the patch (with the exception of one test failure which
>> > > > >   we did not introduce, which is tracked here:
>> > > > >   https://issues.apache.org/jira/browse/MAPREDUCE-6094)
>> > > > >
>> > > > > * Simple integration with MapReduce:
>> > > > >   https://issues.apache.org/jira/browse/MAPREDUCE-6103
>> > > > >
>> > > > > * The broken-down patches have been reviewed and +1ed by Vinod Kumar
>> > > > >   Vavilapalli, Jian He, Wangda Tan, Karthik Kambatla, and Chris Douglas.
>> > > > >   Thanks to all of you for the thorough reviews!
>> > > > >
>> > > > > * The current version has been rather thoroughly tested by running it
>> > > > >   on our 250-machine research cluster for months (the first prototype
>> > > > >   was operational about a year ago) by:
>> > > > >
>> > > > >   o Running hundreds of thousands of jobs generated by a modified
>> > > > >     version of gridmix that exercise the reservations mechanism
>> > > > >     side-by-side with normal queues.
>> > > > >
>> > > > >   o Supporting our integration with the resource estimation framework
>> > > > >     Perforator (http://research.microsoft.com/pubs/178971/perforator.pdf).
>> > > > >     Kaushik and Dharmesh have been pounding the reservation system for
>> > > > >     their research for 3-4 months now, and helped us spot a few bugs and
>> > > > >     iron them out.
>> > > > >
>> > > > >   o Code has been inspected/extended by 4-5 other researchers who are
>> > > > >     exploring integration with other systems and extensions of our
>> > > > >     algorithms for "reservation placement".
>> > > > >
>> > > > > * We have a few ideas for follow-up extensions/improvements, tracked by
>> > > > >   the umbrella JIRA https://issues.apache.org/jira/browse/YARN-2572
>> > > > >
>> > > > > Documents and Deliverables
>> > > > >
>> > > > > * This work was accepted for publication to SoCC 2014 (pre-camera ready
>> > > > >   version of the paper here):
>> > > > >   https://issues.apache.org/jira/secure/attachment/12671498/socc14-paper15.pdf
>> > > > >
>> > > > > * Shorter design doc:
>> > > > >   https://issues.apache.org/jira/secure/attachment/12628330/YARN-1051-design.pdf
>> > > > >
>> > > > > * Overall patch:
>> > > > >   https://issues.apache.org/jira/secure/attachment/12671361/YARN-1051.1.patch
>> > > > >
>> > > > > * Per Karthik's request we are preparing a small how-to document and
>> > > > >   example code/configuration tracked by
>> > > > >   https://issues.apache.org/jira/browse/YARN-2609
>> > > > >
>> > > > > Credits
>> > > > > Subru and I did lots of the coding (hence the flow of patches from
>> > > > > us), but this is a group effort that would not have been possible
>> > > > > without the ideas and hard work of many other folks in our research
>> > > > > group (Microsoft-CISL). Major kudos to: Chris Douglas, Sriram Rao,
>> > > > > Raghu Ramakrishnan, and our intern Djellel Difallah. Also big thanks
>> > > > > to the many folks in the community (Arun, Vinod, Alejandro, Bikas,
>> > > > > Karthik, Sandy, Hitesh, Jakob, Mohammad, Mayank, Jason, Bobby, and
>> > > > > many more) that helped us shape our ideas and code with very
>> > > > > insightful feedback and comments.
>> > > > >
>> > > > > We expect the vote to run for the usual 7 days and to expire at 12pm
>> > > > > PDT on Oct 3. Please feel free to reach out to us if you have any
>> > > > > questions/doubts.
>> > > > >
>> > > > > Cheers,
>> > > > > Carlo & Subru
