[ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404963#comment-15404963 ]
Junping Du commented on YARN-4676:
----------------------------------

bq. If the NM crashes (for example, a JVM exit due to running out of heap), it is supposed to restart automatically instead of waiting for a human to start it. Isn't that the general practice?

I don't think this is the general case, as YARN deployments vary - in many cases (especially in on-premise environments), the NM is not supposed to be so fragile, and the admin needs to figure out what happened before the NM crashed. Also, even if we want the NM to restart immediately (without human assistance/troubleshooting), the auto-restart logic lives outside of YARN and belongs to cluster deployment/monitoring tools such as Ambari. We'd better not bake too many assumptions in here.

bq. But nothing prevents the NM daemon from restarting, whether automatically or by a human. When such an NM restarts, it will try to register itself with the RM, which will tell it to shut down if it still appears in the exclude list. Such a node will remain DECOMMISSIONED inside the RM until, 10+ minutes later, it moves to LOST after the EXPIRE event.

As I said above, this belongs to the admin's behavior or your monitoring tool's logic. If an admin insists on repeatedly starting an NM on a decommissioned node, YARN can do nothing about it except keep shutting that NM down. Such a node should always remain DECOMMISSIONED, and I don't see any benefit in moving it to LOST on the EXPIRE event.

bq. Such a DECOMMISSIONED node can be recommissioned (refreshNodes after it is removed from the exclude list), during which it transitions into the RUNNING state.

I don't see what benefit this hack brings compared with moving the node back to the include list, running refreshNodes, and restarting the NM daemon, which goes through the normal registration process. The risk is that we have to maintain a separate code path dedicated to this minor case.

bq. These behaviors appear to me as robust rather than hacky.
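The register-and-shutdown loop described above can be sketched roughly as follows. This is an illustrative model only, not the actual ResourceTrackerService code: the class name, method signature, and the two-value NodeAction enum below are simplified stand-ins (the real NodeAction carries more values).

```java
import java.util.Set;

// Simplified model of the RM-side registration check under discussion:
// an NM whose host is on the exclude list is told to shut down at
// registration time, so the node stays DECOMMISSIONED no matter how
// often an admin (or a monitoring tool) restarts the daemon.
// Names here are illustrative, not the real ResourceTrackerService API.
public class RegisterCheckSketch {
    enum NodeAction { NORMAL, SHUTDOWN }

    static NodeAction registerNode(String host, Set<String> excludeList) {
        if (excludeList.contains(host)) {
            // Node is still decommissioned: refuse registration.
            return NodeAction.SHUTDOWN;
        }
        // Normal registration path: node proceeds toward RUNNING.
        return NodeAction.NORMAL;
    }

    public static void main(String[] args) {
        Set<String> exclude = Set.of("node-17.example.com");
        System.out.println(registerNode("node-17.example.com", exclude)); // SHUTDOWN
        System.out.println(registerNode("node-42.example.com", exclude)); // NORMAL
    }
}
```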
It appears that the behavior you expect relies on a separate mechanism that permanently shuts down the NM once it is DECOMMISSIONED. I have never heard that we need a separate mechanism to shut down an NM once it is decommissioned; that has been built-in behavior for Apache Hadoop YARN so far. Are you talking about a private/specific branch rather than current trunk/branch-2?

bq. As long as such a DECOMMISSIONED node never tries to register or be recommissioned, then yes, I expect the transitions you listed could be removed.

Re-registration of a node after a refreshNodes operation goes through the normal registration process, which is good enough for me. I don't think we need any change here unless we have strong reasons. So, yes, please remove these transitions, because they are not correct under current YARN logic.

bq. So I see these transitions as really needed. That said, I could remove them and maintain them privately inside the EMR branch for the sake of getting this JIRA going.

I can understand the pain of maintaining a private branch - from the perspective of your private (EMR) branch, these pieces of code may be needed. However, as a community contributor, you have to switch roles and stand on the community code base in trunk/branch-2, and we committers can only help get in code that benefits the whole community. If this code is important for another story (like resource elasticity in YARN) that would benefit the community, we can move it out into dedicated work, but we need an open discussion on design/implementation first - that's the right process for patch/feature contribution.

bq. These transitions have been there almost since the beginning of this JIRA; any other comments/surprises?

These issues have already surprised me enough - the transitions in RMNode belong to very key YARN logic, and we need to be careful as always. I need more time to review the rest of the code. Hopefully, I can finish my first round tomorrow and publish the remaining comments.
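To make the disputed transition concrete: here is a toy state table in the spirit of the RMNode state machine being argued about. The real RMNodeImpl is built with Hadoop's StateMachineFactory and has many more states and events; the names and switch-based dispatch below are simplified assumptions. Under the behavior argued for above, a DECOMMISSIONED node simply absorbs the EXPIRE event rather than moving to LOST:

```java
// Toy model of the RMNode transition under debate. The real RMNodeImpl
// uses Hadoop's StateMachineFactory; this sketch only illustrates the
// position that DECOMMISSIONED should absorb EXPIRE (stay put) instead
// of transitioning to LOST.
public class RMNodeTransitionSketch {
    enum State { RUNNING, DECOMMISSIONED, LOST }
    enum Event { EXPIRE, DECOMMISSION }

    static State transition(State current, Event event) {
        switch (current) {
            case RUNNING:
                if (event == Event.EXPIRE) return State.LOST; // heartbeat timed out
                if (event == Event.DECOMMISSION) return State.DECOMMISSIONED;
                break;
            case DECOMMISSIONED:
                // Terminal for our purposes: EXPIRE is ignored and the
                // node stays DECOMMISSIONED rather than becoming LOST.
                return State.DECOMMISSIONED;
            default:
                break;
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(transition(State.RUNNING, Event.EXPIRE));        // LOST
        System.out.println(transition(State.DECOMMISSIONED, Event.EXPIRE)); // DECOMMISSIONED
    }
}
```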
> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> ----------------------------------------------------------------
>
>                 Key: YARN-4676
>                 URL: https://issues.apache.org/jira/browse/YARN-4676
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: Daniel Zhi
>            Assignee: Daniel Zhi
>              Labels: features
>         Attachments: GracefulDecommissionYarnNode.pdf, GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, YARN-4676.005.patch, YARN-4676.006.patch, YARN-4676.007.patch, YARN-4676.008.patch, YARN-4676.009.patch, YARN-4676.010.patch, YARN-4676.011.patch, YARN-4676.012.patch, YARN-4676.013.patch, YARN-4676.014.patch, YARN-4676.015.patch, YARN-4676.016.patch, YARN-4676.017.patch, YARN-4676.018.patch, YARN-4676.019.patch
>
> YARN-4676 implements an automatic, asynchronous and flexible mechanism to gracefully decommission YARN nodes. After the user issues a refreshNodes request, the ResourceManager automatically evaluates the status of all affected nodes and kicks off decommission or recommission actions. The RM asynchronously tracks container and application status related to DECOMMISSIONING nodes in order to decommission those nodes immediately after they are ready to be decommissioned. Decommissioning timeouts at individual-node granularity are supported and can be dynamically updated. The mechanism naturally supports multiple independent graceful decommissioning "sessions", where each one involves different sets of nodes with different timeout settings. Such support is ideal and necessary for graceful decommission requests issued by external cluster management software instead of a human.
> DecommissioningNodeWatcher inside ResourceTrackingService tracks DECOMMISSIONING node status automatically and asynchronously after the client/admin makes the graceful decommission request.
> It tracks DECOMMISSIONING node status to decide when, after all running containers on the node have completed, the node will be transitioned into the DECOMMISSIONED state. NodesListManager detects and handles include and exclude list changes to kick off decommission or recommission as necessary.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
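The readiness check described in the issue summary above can be sketched as follows. The real DecommissioningNodeWatcher tracks per-node container and application status inside the RM; the class, field, and method names here are simplified assumptions, not the actual implementation.

```java
// Minimal sketch of the readiness check a decommissioning watcher might
// perform: a DECOMMISSIONING node becomes ready to decommission once all
// of its containers have completed, or once its per-node (dynamically
// updatable) timeout has elapsed. Simplified names; not the actual
// DecommissioningNodeWatcher implementation.
public class DecommissionWatcherSketch {
    static final class NodeStatus {
        int runningContainers;
        long decommissionStartMillis;
        long timeoutMillis; // per-node timeout, may be updated dynamically

        NodeStatus(int running, long start, long timeout) {
            this.runningContainers = running;
            this.decommissionStartMillis = start;
            this.timeoutMillis = timeout;
        }
    }

    static boolean readyToDecommission(NodeStatus s, long nowMillis) {
        boolean containersDone = s.runningContainers == 0;
        boolean timedOut = nowMillis - s.decommissionStartMillis >= s.timeoutMillis;
        return containersDone || timedOut;
    }

    public static void main(String[] args) {
        NodeStatus busy = new NodeStatus(3, 0, 600_000);
        NodeStatus idle = new NodeStatus(0, 0, 600_000);
        System.out.println(readyToDecommission(busy, 10_000));  // false: containers still running
        System.out.println(readyToDecommission(idle, 10_000));  // true: node has drained
        System.out.println(readyToDecommission(busy, 600_000)); // true: timeout elapsed
    }
}
```

Because the timeout is stored per node, independent "sessions" of nodes with different timeout settings fall out naturally from this check.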