[
https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404963#comment-15404963
]
Junping Du commented on YARN-4676:
----------------------------------
bq. If the NM crashes (for example, a JVM exit due to running out of heap), it is
supposed to restart automatically, instead of waiting for a human to start it.
Isn't that the general practice?
I don't think this is the general case, as YARN deployments can vary widely
- in many cases (especially in on-premise environments), the NM is not supposed
to be so fragile, and the admin needs to figure out what happened before the NM
crashed. Also, even if we want the NM to restart immediately (without human
assistance/troubleshooting), the auto-restart logic lives outside of YARN and
belongs to cluster deployment/monitoring tools like Ambari. We'd better not
bake too many assumptions in here.
bq. But nothing prevents/disallows the NM daemon from restarting, whether
automatically or by a human. When such an NM restarts, it will try to register
itself with the RM, and it will be told to shut down if it still appears in the
exclude list. Such a node will remain DECOMMISSIONED inside the RM until 10+
minutes later, when it becomes LOST after the EXPIRE event.
As I said above, this belongs to the admin's behavior or your monitoring tools'
logic. If an admin insists on repeatedly starting an NM on a decommissioned
node, YARN can do nothing about it except keep shutting the NM down. Such a
node should always remain DECOMMISSIONED, and I don't see any benefit in moving
it to the EXPIRE status.
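The register-time behavior both sides describe here - an NM on an excluded node is told to shut down, and the RM keeps treating the node as DECOMMISSIONED no matter how often it retries - can be sketched roughly as follows. This is a simplified illustration, not the actual ResourceTrackerService code; the class and method names are hypothetical:

```java
import java.util.Set;

// Hypothetical sketch of the register-time exclusion check discussed above.
// In real YARN this decision is made inside ResourceTrackerService; the
// names here are illustrative only.
class RegistrationCheck {
    private final Set<String> excludeList;

    RegistrationCheck(Set<String> excludeList) {
        this.excludeList = excludeList;
    }

    /** Returns the action the RM takes when a node tries to register. */
    String onRegister(String hostname) {
        if (excludeList.contains(hostname)) {
            // Node is still in the exclude list: tell the NM to shut down.
            // The RM keeps the node as DECOMMISSIONED no matter how many
            // times the daemon is restarted and retries registration.
            return "SHUTDOWN";
        }
        // Node is in the include list: normal registration proceeds.
        return "REGISTERED";
    }
}
```

Under this sketch, recommissioning is simply removing the host from the exclude list (via refreshNodes) so that the next registration attempt succeeds through the normal path.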
bq. Such a DECOMMISSIONED node can be recommissioned (refreshNodes after it is
removed from the exclude list), during which it transitions into the RUNNING
state.
I don't see any benefit this hack brings compared with moving the node to the
include list, calling refreshNodes, and restarting the NM daemon, which goes
through the normal register process. The risk is that we would have to maintain
a separate code path dedicated to this minor case.
bq. This behavior appears to me as robust rather than hacky. It appears that
the behavior you expect relies on a separate mechanism that permanently shuts
down the NM once it is DECOMMISSIONED.
I have never heard that we need a separate mechanism to shut down the NM once
it is decommissioned. That should be built-in behavior for Apache Hadoop YARN
so far. Are you talking about a private/specific branch rather than current
trunk/branch-2?
bq. As long as such a DECOMMISSIONED node never tries to register or be
recommissioned, yes, I expect these transitions you listed could be removed.
The re-registration of a node after a refreshNodes operation goes through the
normal register process, which is good enough for me. I don't think we need any
change here unless we have strong reasons. So, yes, please remove these
transitions, because they are not correct based on current YARN logic.
bq. So I see these transitions are really needed. That said, I could remove
them and maintain them privately inside the EMR branch for the sake of getting
this JIRA going.
I can understand the pain of maintaining a private branch - maybe, standing in
your private (EMR) branch, these pieces of code are needed. However, as a
community contributor, you have to switch roles and stand in the community code
base in trunk/branch-2, and we committers can only help to get in code that
benefits the whole community. If these pieces of code are important for another
story (like resource elasticity in YARN) that would benefit the community, we
can move them out into another dedicated work item, but we need to have an open
discussion on design/implementation ahead of time - that's the right process
for patch/feature contribution.
bq. These transitions have been there almost since the beginning of this JIRA;
any other comments/surprises?
These issues already surprise me enough - these transitions in RMNode belong to
very key YARN logic, and we need to be careful as always. I need more time to
review the rest of the code. Hopefully I can finish my first round tomorrow and
publish the remaining comments.
> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> ----------------------------------------------------------------
>
> Key: YARN-4676
> URL: https://issues.apache.org/jira/browse/YARN-4676
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.8.0
> Reporter: Daniel Zhi
> Assignee: Daniel Zhi
> Labels: features
> Attachments: GracefulDecommissionYarnNode.pdf,
> GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, YARN-4676.005.patch,
> YARN-4676.006.patch, YARN-4676.007.patch, YARN-4676.008.patch,
> YARN-4676.009.patch, YARN-4676.010.patch, YARN-4676.011.patch,
> YARN-4676.012.patch, YARN-4676.013.patch, YARN-4676.014.patch,
> YARN-4676.015.patch, YARN-4676.016.patch, YARN-4676.017.patch,
> YARN-4676.018.patch, YARN-4676.019.patch
>
>
> YARN-4676 implements an automatic, asynchronous and flexible mechanism to
> gracefully decommission YARN nodes. After the user issues the refreshNodes
> request, the ResourceManager automatically evaluates the status of all
> affected nodes and kicks off decommission or recommission actions. The RM
> asynchronously tracks container and application status related to
> DECOMMISSIONING nodes so that it can decommission each node immediately
> once it is ready to be decommissioned. A decommissioning timeout at
> individual-node granularity is supported and can be dynamically updated.
> The mechanism naturally supports multiple independent graceful
> decommissioning "sessions", where each involves a different set of nodes
> with different timeout settings. Such support is ideal and necessary for
> graceful decommission requests issued by external cluster management
> software instead of a human.
> DecommissioningNodeWatcher inside ResourceTrackerService tracks
> DECOMMISSIONING node status automatically and asynchronously after the
> client/admin makes the graceful decommission request. It tracks
> DECOMMISSIONING node status to decide when the node, after all running
> containers on it have completed, will transition into the DECOMMISSIONED
> state. NodesListManager detects and handles include and exclude list
> changes to kick off decommission or recommission as necessary.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)