Junping Du commented on YARN-3212:

Thanks [~leftnoteasy] for the review and comments!
bq. 1. Why shutdown a "decommissioning" NM if it is doing heartbeat. Should we 
allow it continue heartbeat, since RM needs to know about container finished / 
killed information.
We don't shutdown a "decommissioning" NM. On the contrary, we differentiate 
decommissioning nodes from the other nodes that get false from the 
nodesListManager.isValidNode() check, so a decommissioning node can keep 
running instead of being shut down.
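To make that concrete, here is a minimal sketch of the idea. The names below are simplified assumptions for illustration, not the actual NodesListManager/ResourceTrackerService API:

```java
// Illustrative sketch only; the real check lives in NodesListManager and
// ResourceTrackerService, and these names are simplified assumptions.
class NodeCheckSketch {
    enum NodeState { RUNNING, DECOMMISSIONING, DECOMMISSIONED }

    // A node keeps heartbeating if it passes the include/exclude-list check,
    // OR if it is draining under a graceful decommission.
    static boolean shouldKeepHeartbeating(boolean isValidNode, NodeState state) {
        return isValidNode || state == NodeState.DECOMMISSIONING;
    }
}
```

The point is that failing the include-list check alone is no longer enough to shut a node down; its DECOMMISSIONING state keeps it alive so the RM still learns about finished/killed containers.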

bq. 2. Do we have timeout of graceful decomission? Which will update a node to 
"DECOMMISSIONED" after the timeout.
There was some discussion in the umbrella JIRA 
 and we decided to track the timeout in the CLI instead of the RM. The CLI 
patch (YARN-3225) also shows that.

bq. 3. If I understand correct, decommissioning is another running state, 
except: We cannot allocate any new containers to it.
Exactly. Another difference is that the available resource should be updated as 
each running container finishes.
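As an illustration of that bookkeeping, here is a hypothetical sketch (the real accounting is the subject of YARN-3223; the class and field names are made up for this example). When a container finishes on a draining node, the freed capacity is taken out of the node's total so it never becomes schedulable again:

```java
// Hypothetical sketch of decommissioning-time resource bookkeeping, not the
// real YARN accounting. At decommission start, total is pinned to used, so
// available (total - used) is 0; each finished container shrinks both.
class DrainAccountingSketch {
    int totalMemMB;
    int usedMemMB;

    DrainAccountingSketch(int totalMemMB, int usedMemMB) {
        this.totalMemMB = totalMemMB;
        this.usedMemMB = usedMemMB;
    }

    void onContainerFinished(int containerMemMB) {
        usedMemMB -= containerMemMB;
        totalMemMB -= containerMemMB; // shrink total; available stays 0
    }

    int availableMemMB() {
        return totalMemMB - usedMemMB;
    }
}
```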

bq. If answer to question #2 is no, I suggest to rename it, since it 
doesn't have a "real" timeout.
As replied above, we do support a timeout via the CLI. DECOMMISSION_WITH_TIMEOUT 
sounds clearer compared with the old DECOMMISSION event. Thoughts?

bq. Why this is need? .addTransition(NodeState.DECOMMISSIONING, 
Without this transition, an InvalidStateTransitionException would be thrown by 
our state machine, which is not right for a normal operation.
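The failure mode can be sketched with a minimal transition table (this is a toy stand-in, not Hadoop's real StateMachineFactory): looking up a (state, event) pair that was never registered throws, which is the analogue of InvalidStateTransitionException.

```java
import java.util.HashMap;
import java.util.Map;

// Toy transition-table sketch, not Hadoop's StateMachineFactory.
// An unregistered (state, event) pair throws, mirroring
// InvalidStateTransitionException in the real RMNodeImpl state machine.
class TinyStateMachine {
    enum NodeState { RUNNING, DECOMMISSIONING, DECOMMISSIONED }
    enum EventType { DECOMMISSION, DECOMMISSION_WITH_TIMEOUT }

    private final Map<String, NodeState> table = new HashMap<>();

    void addTransition(NodeState from, EventType event, NodeState to) {
        table.put(from + "/" + event, to);
    }

    NodeState doTransition(NodeState current, EventType event) {
        NodeState next = table.get(current + "/" + event);
        if (next == null) {
            throw new IllegalStateException(
                "Invalid event " + event + " on state " + current);
        }
        return next;
    }
}
```

Registering the extra DECOMMISSIONING-on-DECOMMISSION_WITH_TIMEOUT transition makes a repeated graceful-decommission request a harmless self-transition instead of an error.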

bq. Should we simply ignore the DECOMMISSION_WITH_TIMEOUT event?
No. The RM should be aware of this event so it can later do precise updates on 
the available resource, etc. (YARN-3223).

bq. Is there specific considerations that transfer UNHEALTHY to DECOMMISSIONED 
when DECOMMISSION_WITH_TIMEOUT received? Is it better to transfer it to 
DECOMMISSIONING since it has some containers running on it?
I don't have a strong preference in this case. However, my earlier thinking was 
that the UNHEALTHY event comes from the machine monitor, which indicates the 
node is no longer really suitable for containers to keep running on, while 
DECOMMISSION_WITH_TIMEOUT comes from a user who prefers to decommission a batch 
of nodes without affecting apps/containers that are currently running 
*normally*. So I think making the node get decommissioned sounds like the 
simpler way until we have more operational experience with this new feature. I 
have a similar view on the discussion above about delivering an UNHEALTHY event 
to a decommissioning node 
 Maybe we can revisit this later?

bq. One suggestion of how to handle node update to scheduler: I think you can 
add a field "isDecomissioning" to NodeUpdateSchedulerEvent, and scheduler can 
do all updates except allocate container.
Thanks for the good suggestion. YARN-3223 will handle balancing the NM's total 
resource against its used resource (so the available resource is always 0). 
Using a new scheduler event this way could be one option for keeping the NM 
resource balanced. There are other options too, so I think we can move the 
discussion to that JIRA.
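The reviewer's suggestion can be sketched roughly as follows. This is a hypothetical simplification: the real NodeUpdateSchedulerEvent does not currently carry such a flag, and the names here are assumptions:

```java
// Hypothetical simplification of the suggestion; the real
// NodeUpdateSchedulerEvent in the YARN scheduler has no such flag today.
class NodeUpdateEventSketch {
    final String nodeId;
    final boolean isDecommissioning;

    NodeUpdateEventSketch(String nodeId, boolean isDecommissioning) {
        this.nodeId = nodeId;
        this.isDecommissioning = isDecommissioning;
    }
}

class SchedulerSketch {
    // On a node update, the scheduler would still process container statuses,
    // completed containers, and headroom, but skip allocating new containers
    // on a draining node.
    static boolean mayAllocateNewContainers(NodeUpdateEventSketch event) {
        return !event.isDecommissioning;
    }
}
```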

> RMNode State Transition Update with DECOMMISSIONING state
> ---------------------------------------------------------
>                 Key: YARN-3212
>                 URL: https://issues.apache.org/jira/browse/YARN-3212
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: RMNodeImpl - new.png, YARN-3212-v1.patch, 
> YARN-3212-v2.patch, YARN-3212-v3.patch, YARN-3212-v4.1.patch, 
> YARN-3212-v4.patch, YARN-3212-v5.1.patch, YARN-3212-v5.patch
> As proposed in YARN-914, a new state of “DECOMMISSIONING” will be added, 
> which can be transitioned to from the “running” state, triggered by a new 
> event - “decommissioning”. 
> This new state can transition to the “decommissioned” state on 
> Resource_Update if there are no running apps on this NM, when the NM 
> reconnects after a restart, or when it receives a DECOMMISSIONED event 
> (after the timeout from the CLI).
> In addition, it can go back to “running” if the user decides to cancel the 
> previous decommission by calling recommission on the same node. The reaction 
> to other events is similar to the RUNNING state.

This message was sent by Atlassian JIRA
