Junping Du commented on YARN-914:

Hi [~danzhi], thanks for sharing the information above, and welcome to 
contributing to Apache Hadoop.

bq. Our implementation is much in sync with the architecture and idea in the 
JIRA design document.
Good to hear that we are on the same page. One thing we need to pay attention 
to is that we already have many patches committed to trunk/branch-2.8. As a 
continuous development effort on YARN, we need to remove the code (currently 
internal to your fork) for similar functionality or APIs before contributing; 
otherwise it takes reviewers/committers more effort to differentiate which 
functionalities/APIs are duplicated and which are not, which usually takes 
much longer.

bq. On the other hand, there are additional details and component-level designs 
that the JIRA design document does not necessarily discuss or touch. These 
details naturally surfaced during the development iterations, and the 
corresponding designs matured and stabilized.
I agree that a design document can miss some implementation details in general. 
However, we can find more background/details in the JIRA discussions and the 
patch implementations. Let me explain below.

bq. One example is the DecommissioningNodeWatcher, which, embedded in 
ResourceTrackingService, tracks DECOMMISSIONING nodes' status automatically and 
asynchronously after the client/admin makes a graceful decommission request. 
Another example is per-node decommission timeout support, which is useful for 
decommissioning nodes that will be terminated soon.
Actually, our current design and committed patches already support the timeout 
feature. There are basically two ways to handle the timeout: RM side or CLI 
side; both have pros and cons. Per the discussions above, we (Jason, Vinod and 
I) all agreed to go with the CLI way first, and we already implemented it in a 
sub-JIRA (YARN-3225), which has been committed. Of course, we are open to the 
other implementation, but we do want it to be behind an on/off configuration 
switch that does not affect the currently preferred option we already 
implemented.
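For context, the CLI-side flow looks roughly like the sketch below, assuming 
the `-refreshNodes -g <timeout>` syntax added by YARN-3225. The host name, 
excludes-file path, and timeout value are illustrative examples, not taken 
from this JIRA:

```shell
# Sketch of the CLI-side graceful decommission flow (YARN-3225 syntax).
# Host name and excludes-file path below are hypothetical examples.

# 1. Add the node to the excludes file configured via
#    yarn.resourcemanager.nodes.exclude-path in yarn-site.xml.
echo "nm-host-01.example.com" >> /etc/hadoop/conf/yarn.exclude

# 2. Ask the RM to re-read the excludes file and decommission gracefully.
#    With CLI-side tracking, the client polls node state; if nodes are
#    still DECOMMISSIONING when the timeout (seconds) expires, the
#    decommission is forced.
yarn rmadmin -refreshNodes -g 3600
```

An RM-side implementation would instead move this tracking into the 
ResourceManager itself, which is exactly the trade-off being discussed above.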

bq. Are you able to share these details in an "augmented" design doc? Agreeing 
on the design would greatly help with review/commits later.
I would prefer an effort to abstract the different implementations for 
tracking/handling the timeout. That does not sound like an overall "augmented" 
design, given you previously said your work is "much in sync" with the current 
architecture and design. Also, it is more proper to create a sub-JIRA to 
discuss your ideas and attach your document there, given that we already have a 
very long discussion here on the overall design.

bq. As far as implementation goes, it is recommended to create subtasks as you 
see fit. Note that it is easier to review smaller chunks of code. Also, since 
you guys have implemented it already, can you comment on how much of the code 
changes are in frequently updated parts? If not much, it might make sense to 
develop on a branch and merge it to trunk.
I would say most parts of YARN-914 are already committed or have patches 
available. Enhancing the timeout tracking/handling does not sound like a 
massive amount of work, so a dedicated development branch seems unnecessary to 
me. However, I would prefer to create a sub-JIRA to discuss the idea/scope and 
take a look at your demo code (with the duplicated code/features that are 
already committed or publicly patch-available removed) before making any 
decision.

[~danzhi], the concrete steps I would suggest for now are:
1. Review all JIRA discussions/design docs/implementations under this umbrella 
JIRA so far, and understand the scope and the gap with your current internal 
implementation.
2. Raise a sub-JIRA with your ideas/design to highlight the different options 
for discussion. If possible, attach a demo patch with any code or features that 
duplicate existing patches removed, for better understanding. We can discuss 
later how to bring in your patch contribution.
Make sense?

> (Umbrella) Support graceful decommission of nodemanager
> -------------------------------------------------------
>                 Key: YARN-914
>                 URL: https://issues.apache.org/jira/browse/YARN-914
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: graceful
>    Affects Versions: 2.0.4-alpha
>            Reporter: Luke Lu
>            Assignee: Junping Du
>         Attachments: Gracefully Decommission of NodeManager (v1).pdf, 
> Gracefully Decommission of NodeManager (v2).pdf, 
> GracefullyDecommissionofNodeManagerv3.pdf
> When NMs are decommissioned for non-fault reasons (capacity change etc.), 
> it's desirable to minimize the impact to running applications.
> Currently if a NM is decommissioned, all running containers on the NM need to 
> be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
> map outputs have not been fetched by the reducers of the job, these map tasks 
> will need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a 
> node manager.
