Junping Du commented on YARN-3212:
Thanks [~leftnoteasy] for the review and comments!
bq. 1. Why shutdown a "decommissioning" NM if it is doing heartbeat. Should we
allow it continue heartbeat, since RM needs to know about container finished /
We don't shut down a "decommissioning" NM. On the contrary, we differentiate
decommissioning nodes from the other nodes that get false from the
nodesListManager.isValidNode() check, so they can keep running instead of
bq. 2. Do we have a timeout for graceful decommission, which will update a node
to "DECOMMISSIONED" after the timeout?
There are some discussions in umbrella JIRA
so we decided to track the timeout in the CLI instead of the RM. The CLI patch
(YARN-3225) also shows that.
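To make the CLI-side tracking concrete, here is a rough sketch of what the client could do: poll until the nodes are drained or the deadline passes, then report which nodes still need a forced DECOMMISSION. All names here (GracefulDecommissionClient, waitForDrain, the isDrained check, the injected clock) are hypothetical and are not taken from the YARN-3225 patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

// Hypothetical sketch of client-side timeout tracking; none of these names
// come from the YARN-3225 patch.
class GracefulDecommissionClient {

  /**
   * Polls until every node is drained or the deadline passes.
   * Returns the nodes that are still pending, i.e. the ones the client
   * would then ask the RM to decommission forcefully.
   */
  static List<String> waitForDrain(List<String> nodes,
                                   Predicate<String> isDrained,
                                   LongSupplier clockMillis,
                                   long timeoutMillis) {
    long deadline = clockMillis.getAsLong() + timeoutMillis;
    List<String> pending = new ArrayList<>(nodes);
    while (!pending.isEmpty() && clockMillis.getAsLong() < deadline) {
      // Drop every node that has finished draining since the last poll.
      pending.removeIf(isDrained);
      // A real client would sleep between polls; omitted to keep this short.
    }
    return pending;
  }
}
```

The clock is passed in so the timeout logic can be exercised without waiting; a real client would pass System::currentTimeMillis.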
bq. 3. If I understand correctly, decommissioning is another running state,
except: We cannot allocate any new containers to it.
Exactly. Another difference is that the available resource should be updated
each time a running container finishes.
bq. If the answer to question #2 is no, I suggest renaming
RMNodeEventType.DECOMMISSION_WITH_TIMEOUT to GRACEFUL_DECOMMISSION, since it
doesn't have a "real" timeout.
As replied above, we support the timeout in the CLI. DECOMMISSION_WITH_TIMEOUT
sounds clearer compared with the old DECOMMISSION event. Thoughts?
bq. Why is this needed? .addTransition(NodeState.DECOMMISSIONING,
NodeState.DECOMMISSIONING, RMNodeEventType.DECOMMISSION_WITH_TIMEOUT, new
Without this transition, our state machine would throw an
InvalidStateTransitionException, which doesn't sound right for a normal
operation.
bq. Should we simply ignore the DECOMMISSION_WITH_TIMEOUT event?
No. The RM should be aware of this event so that it can later update the
available resource precisely, etc. (YARN-3223).
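The point about the self-transition can be illustrated with a tiny table-driven state machine. This is a simplified stand-in and not Hadoop's StateMachineFactory: the class name, the RECOMMISSION event name, and the use of IllegalStateException in place of InvalidStateTransitionException are all assumptions made for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the RMNode state machine; not Hadoop's
// StateMachineFactory. RECOMMISSION is a hypothetical event name.
class NodeStateMachineSketch {
  enum NodeState { RUNNING, DECOMMISSIONING, DECOMMISSIONED }
  enum EventType { DECOMMISSION_WITH_TIMEOUT, DECOMMISSION, RECOMMISSION }

  private final Map<String, NodeState> transitions = new HashMap<>();
  private NodeState current = NodeState.RUNNING;

  NodeStateMachineSketch addTransition(NodeState from, NodeState to,
                                       EventType on) {
    transitions.put(from + "/" + on, to);
    return this;
  }

  NodeState handle(EventType event) {
    NodeState next = transitions.get(current + "/" + event);
    if (next == null) {
      // Stand-in for InvalidStateTransitionException.
      throw new IllegalStateException(
          "Invalid event " + event + " at " + current);
    }
    current = next;
    return current;
  }

  /** Transition table that includes the DECOMMISSIONING self-transition. */
  static NodeStateMachineSketch withSelfTransition() {
    return new NodeStateMachineSketch()
        .addTransition(NodeState.RUNNING, NodeState.DECOMMISSIONING,
            EventType.DECOMMISSION_WITH_TIMEOUT)
        // Repeated graceful-decommission request: stay in the same state.
        .addTransition(NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONING,
            EventType.DECOMMISSION_WITH_TIMEOUT)
        .addTransition(NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONED,
            EventType.DECOMMISSION)
        .addTransition(NodeState.DECOMMISSIONING, NodeState.RUNNING,
            EventType.RECOMMISSION);
  }
}
```

With the self-transition registered, a second DECOMMISSION_WITH_TIMEOUT on a node that is already decommissioning is handled (and can carry a hook for later resource updates) instead of raising an error.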
bq. Are there specific considerations for transitioning UNHEALTHY to
DECOMMISSIONED when DECOMMISSION_WITH_TIMEOUT is received? Would it be better to
transition it to DECOMMISSIONING, since it has some containers running on it?
I don't have a strong preference in this case. However, my earlier reasoning
was that the UNHEALTHY event comes from the machine monitor, which indicates
the node is not really suitable for keeping containers running, while
DECOMMISSION_WITH_TIMEOUT comes from a user who prefers to decommission a batch
of nodes without affecting apps/containers that are currently running
*normally*. So I think moving it to decommissioned is the simpler way until we
have more operational experience with this new feature. I have a similar view
on the discussion above on UNHEALTHY event to a decommissioning event
Maybe we can revisit this later?
bq. One suggestion of how to handle node update to scheduler: I think you can
add a field "isDecomissioning" to NodeUpdateSchedulerEvent, and scheduler can
do all updates except allocate container.
Thanks for the good suggestion. YARN-3223 will handle keeping the NM's total
resource and used resource balanced (so the available resource is always 0).
So using a new scheduler event this way could be one option for keeping the NM
resource balanced. There are other options too, so I think we can move the
discussion to that JIRA.
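As a rough illustration of that option, the sketch below applies a node update that carries the suggested decommissioning flag: the scheduler still releases resources of finished containers, but it keeps total equal to used (so available stays 0) and does not allocate. The class and field names are hypothetical and are not the actual NodeUpdateSchedulerEvent or the YARN-3223 design.

```java
import java.util.List;

// Hypothetical names throughout; only the idea of a decommissioning flag on
// the node update comes from the review suggestion.
class DecommissioningUpdateSketch {

  static final class NodeUpdate {
    final boolean isDecommissioning;
    // Memory (MB) of each container that finished since the last heartbeat.
    final List<Integer> completedContainerMb;

    NodeUpdate(boolean isDecommissioning, List<Integer> completedContainerMb) {
      this.isDecommissioning = isDecommissioning;
      this.completedContainerMb = completedContainerMb;
    }
  }

  static final class SchedulerNode {
    int totalMb;
    int usedMb;

    SchedulerNode(int totalMb, int usedMb) {
      this.totalMb = totalMb;
      this.usedMb = usedMb;
    }

    int availableMb() {
      return totalMb - usedMb;
    }
  }

  /**
   * Applies one node update and returns true if the node may receive new
   * containers afterwards.
   */
  static boolean onNodeUpdate(SchedulerNode node, NodeUpdate update) {
    // Bookkeeping for finished containers happens in every case.
    for (int mb : update.completedContainerMb) {
      node.usedMb -= mb;
    }
    if (update.isDecommissioning) {
      // Shrink total to match used so that available stays at 0.
      node.totalMb = node.usedMb;
      return false; // never allocate on a decommissioning node
    }
    return node.availableMb() > 0;
  }
}
```

For example, a decommissioning node with 8192 MB total and 4096 MB used that reports two finished 1024 MB containers ends up with 2048 MB total, 2048 MB used, and 0 MB available.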
> RMNode State Transition Update with DECOMMISSIONING state
> Key: YARN-3212
> URL: https://issues.apache.org/jira/browse/YARN-3212
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Junping Du
> Assignee: Junping Du
> Attachments: RMNodeImpl - new.png, YARN-3212-v1.patch,
> YARN-3212-v2.patch, YARN-3212-v3.patch, YARN-3212-v4.1.patch,
> YARN-3212-v4.patch, YARN-3212-v5.1.patch, YARN-3212-v5.patch
> As proposed in YARN-914, a new state of “DECOMMISSIONING” will be added and
> can transition from “running” state triggered by a new event -
> This new state can transition to the “decommissioned” state on
> Resource_Update if there are no running apps on this NM, or when the NM
> reconnects after restart, or when it receives a DECOMMISSIONED event (after
> timeout from the CLI).
> In addition, it can go back to “running” if the user decides to cancel the
> previous decommission by calling recommission on the same node. The reaction
> to other events is similar to that of the RUNNING state.