Catching up on this thread. I'd missed Wangda's comment. This was
merged without testing it end-to-end? And backported to branch-2?

I share Karthik's reservations. This is a very aggressive merging
strategy. Every other feature merged to branch-2 must also help
stabilize YARN-1197. Going through the patch, the particular changes
are not terribly threatening and the passing unit tests are
encouraging. However, a few days' delay to test the feature is not
asking much, particularly when you have reason to be confident in its
stability.

-0, after the fact. -C

On Thu, Sep 24, 2015 at 11:39 AM, Wangda Tan <[email protected]> wrote:
> Trunk build seems stable now after the merge. A couple of yarn trunk builds
> also ran fine such as: https://builds.apache.org/job/Hadoop-Yarn-trunk/1169/
> .
>
> I've merged this into branch-2 so as to minimize YARN dev effort across
> branches as per the discussion thread of YARN-1197 merge.
>
> Thanks,
> Wangda
>
>
> On Wed, Sep 23, 2015 at 3:15 PM, Wangda Tan <[email protected]> wrote:
>
>> Thanks Karthik,
>>
>> Doing end-to-end test as well as regression test is the top priority of
>> this feature. Will keep you updated.
>>
>> Regards,
>> Wangda
>>
>> On Wed, Sep 23, 2015 at 3:01 PM, Karthik Kambatla <[email protected]>
>> wrote:
>>
>>> I did skim through the patch before voting :)
>>>
>>> As I mentioned earlier, I understand vast majority of the code is net new,
>>> and the likelihood of breaking changes is low. Still, I would like for us
>>> to be more careful and run some end-to-end and regression tests before
>>> including this in a release. While I won't block the merge to branch-2,
>>> will be a -1 on cutting a release without due testing.
>>>
>>> At the risk of digressing, I also feel branch-2 should be in a releasable
>>> state more often than not. The only way to bring any cadence and
>>> predictability to our releases is by maintaining branch-2 in such a state.
>>>
>>> On Tue, Sep 22, 2015 at 8:14 PM, Wangda Tan <[email protected]> wrote:
>>>
>>> > Hi Karthik,
>>> >
>>> > Let me elaborate more to make you feel better of this change, don't be
>>> > scared by the size of the patch :)
>>> >
>>> > Common RM/Scheduler part:
>>> > - AbstractYarnScheduler new logic only.
>>> > - AppSchedulingInfo new logic only.
>>> > - RMContainer / RMNode state machine, new logic only
>>> > - SchedulerApplicationAttempt / Allocation, refactoring to existing
>>> > reservation logic so increase request reservation can reuse it, and
>>> > refactored to simply updating container token / pull container part so
>>> > increase/decrease/new-allocation can reuse same code.
>>> >
>>> > FairScheduler:
>>> > - Small change since we updated how to pull container updated token. I
>>> > believe it will be a straightforward change for you if you take a closer
>>> > look at it.
>>> >
>>> > CapacityScheduler:
>>> > - Most changes are separate logic or small refactorings, most complex
>>> > allocation logic stays within IncreaseContainerAllocator.java.
>>> >
>>> > Please let me know where you want to get more details of
>>> implementations.
>>> >
>>> > I strongly suggest you to take a glance at the diff, we have already
>>> worked
>>> > on the merge for the past one week, and we've paid a lot of extra time
>>> to
>>> > keep YARN-1197 sync with trunk in the past several months. After this
>>> merge
>>> > finished, a couple of weeks needed to finish end-to-end test and some
>>> other
>>> > extra tests, it won't affect our upcoming branch-2 release.
>>> >
>>> > I would not prefer to merge to trunk only, all people working on RM side
>>> > will be affected, we're very carefully avoid such divergence of RM in
>>> > trunk/branch-2. Since nobody wants to create two different patches for
>>> > every RM changes. And also, after this finished, other efforts can
>>> happen
>>> > in parallel such as YARN-4091.
>>> >
>>> > Let me know if you have any other questions/concerns.
>>> >
>>> > Thanks,
>>> > Wangda
>>> >
>>> > On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <[email protected]>
>>> > wrote:
>>> >
>>> > > I am sorry, but merging a potentially disruptive change to branch-2
>>> > without
>>> > > end-to-end tests seems too disruptive to me.
>>> > >
>>> > > I do agree with you on the potential inconvenience of having to post
>>> > > different patches for trunk and branch-2, but I would rather have that
>>> > > inconvenience than the risk of merging something that hasn't been
>>> > > thoroughly tested.
>>> > >
>>> > > On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <[email protected]>
>>> wrote:
>>> > >
>>> > > > Hi Karthik,
>>> > > >
>>> > > > Thanks for comments! However, I think only merge to trunk may not
>>> work,
>>> > > > this patch involves thousands lines of code changes in scheduler
>>> side,
>>> > > only
>>> > > > putting that to trunk could lead to trunk/branch-2 totally
>>> incompatible
>>> > > for
>>> > > > resource manager. I think most of the code changes are new to
>>> scheduler
>>> > > > instead of modifying existed logic, they're not very tricky to me.
>>> And
>>> > > when
>>> > > > 2.8 will be released is not planned yet, at least we have a couple
>>> of
>>> > > > months to make sure this feature becomes usable and not cause
>>> existing
>>> > > > behavior regressions.
>>> > > >
>>> > > > Sounds good to you?
>>> > > >
>>> > > > Wangda
>>> > > >
>>> > > >
>>> > > > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <
>>> [email protected]>
>>> > > > wrote:
>>> > > >
>>> > > > > +1 on merging to trunk. It would be nice to have some amount of
>>> > testing
>>> > > > > done before the merge, but I understand how merging to trunk would
>>> > > likely
>>> > > > > speed up the testing efforts.
>>> > > > >
>>> > > > > Let us not merge into branch-2 until after we have done a fair
>>> bit of
>>> > > > > testing, and are comfortable including it in a release. While the
>>> > code
>>> > > > > mostly appears to not mess with existing scheduling logic, I am
>>> > > concerned
>>> > > > > about regressions to existing scheduling behavior.
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <
>>> > [email protected]>
>>> > > > > wrote:
>>> > > > >
>>> > > > > > By the way, for the purposes of merge vote, I believe a
>>> committer's
>>> > > > vote
>>> > > > > > is binding. So, Wangda and Zhihai's votes should be binding. :)
>>> > > > > >
>>> > > > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <[email protected]>
>>> > > wrote:
>>> > > > > >
>>> > > > > >> +1 (non-binding)
>>> > > > > >>
>>> > > > > >> thanks
>>> > > > > >> Zhihai Xu
>>> > > > > >>
>>> > > > > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <
>>> > [email protected]>
>>> > > > > >> wrote:
>>> > > > > >>
>>> > > > > >> > +1 Binding
>>> > > > > >> >
>>> > > > > >> > Thanks
>>> > > > > >> >
>>> > > > > >> > Xuan Gong
>>> > > > > >> >
>>> > > > > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <
>>> [email protected]
>>> > >
>>> > > > > wrote:
>>> > > > > >> > >
>>> > > > > >> > > +1. (Binding).
>>> > > > > >> > >
>>> > > > > >> > > Thanks,
>>> > > > > >> > >
>>> > > > > >> > > Junping
>>> > > > > >> > > ________________________________________
>>> > > > > >> > > From: Wangda Tan <[email protected]>
>>> > > > > >> > > Sent: Thursday, September 17, 2015 3:19 AM
>>> > > > > >> > > To: [email protected]
>>> > > > > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize into
>>> > trunk
>>> > > > > >> > >
>>> > > > > >> > > +1 (non-binding),
>>> > > > > >> > >
>>> > > > > >> > > Thanks Jian starting this thread. This can minimize effort
>>> of
>>> > > > works
>>> > > > > >> > across branches.
>>> > > > > >> > >
>>> > > > > >> > > To clarify, this feature is end-to-end code  completed, we
>>> > have
>>> > > > API,
>>> > > > > >> > rm/nm implementations patches committed, but we haven't
>>> tested
>>> > it
>>> > > > > >> > end-to-end. Filed YARN-4175 to create an example program to
>>> test
>>> > > it
>>> > > > > >> > end-to-end.
>>> > > > > >> > >
>>> > > > > >> > > Regards,
>>> > > > > >> > > Wangda
>>> > > > > >> > >
>>> > > > > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <[email protected]
>>> >
>>> > > > wrote:
>>> > > > > >> > >>
>>> > > > > >> > >> Hi All,
>>> > > > > >> > >>
>>> > > > > >> > >> Thanks Meng Ding and Wangda Tan for all the hard work !
>>> > > > > >> > >>
>>> > > > > >> > >> I would like to call a vote to merge YARN-1197 container
>>> > resize
>>> > > > > into
>>> > > > > >> > trunk.
>>> > > > > >> > >>
>>> > > > > >> > >> Key idea:
>>> > > > > >> > >> This feature adds the ability for AM to change container
>>> > > resource
>>> > > > > >> size
>>> > > > > >> > at runtime.
>>> > > > > >> > >>
>>> > > > > >> > >> Details:
>>> > > > > >> > >> - This feature is tracked at
>>> > > > > >> > https://issues.apache.org/jira/browse/YARN-1197
>>> > > > > >> > >> - It’s currently developed at a separate branch:
>>> > > > > >> > https://github.com/apache/hadoop/commits/YARN-1197
>>> > > > > >> > >> - A uber patch(
>>> > https://issues.apache.org/jira/browse/YARN-4157
>>> > > )
>>> > > > > >> > generated from YARN-1197 to run against trunk  shows all unit
>>> > > tests
>>> > > > > have
>>> > > > > >> > passed.
>>> > > > > >> > >> - This feature now can work end-to-end.
>>> > > > > >> > >> -  All the unresolved jiras under YARN-1197 will be the
>>> next
>>> > > > step.
>>> > > > > >> > >>
>>> > > > > >> > >> Thanks,
>>> > > > > >> > >> Wangda Tan & Meng Ding & Jian He
>>> > > > > >> > >
>>> > > > > >> > >
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >>
>>> > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>
>>

Reply via email to