[ https://issues.apache.org/jira/browse/YARN-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15038283#comment-15038283 ]
Kuhu Shukla commented on YARN-4386:
-----------------------------------
[~sunilg], [~djp]: I would like your comments on how to test this. Even during
the transition, the RMNode is first removed from the active list and only then
put in the inactive RMNode list. Unless two refreshNodes calls run in parallel,
such that the first deactivateNodeTransition has not finished while the second
refreshNodes attempts the same transition, only one of them would succeed, and
this would not be a race (?). Let me know if that makes sense.
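To make the reasoning above concrete, here is a minimal sketch of the "remove from active, then put in inactive" step. It is a simplified model, not the actual ResourceManager code: the class name, the String-keyed maps and the deactivate() helper are all hypothetical stand-ins, and it assumes the active map is a concurrent map whose remove() returns the old value to exactly one caller.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the active -> inactive move; not the real RMNodeImpl.
class DeactivateSketch {
    // Stand-ins for the active and inactive RMNode maps (node id -> state name).
    static final ConcurrentMap<String, String> active = new ConcurrentHashMap<>();
    static final ConcurrentMap<String, String> inactive = new ConcurrentHashMap<>();

    // Removes the node from the active map first, then records it as inactive.
    // Returns true only for the caller whose remove() actually found the node.
    static boolean deactivate(String nodeId) {
        String previous = active.remove(nodeId);
        if (previous == null) {
            return false; // another caller already moved this node
        }
        inactive.put(nodeId, "DECOMMISSIONED");
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        active.put("host1:1234", "RUNNING");
        AtomicInteger successes = new AtomicInteger();

        // Two "refreshNodes" callers attempting the same transition in parallel.
        Runnable refresh = () -> {
            if (deactivate("host1:1234")) {
                successes.incrementAndGet();
            }
        };
        Thread first = new Thread(refresh);
        Thread second = new Thread(refresh);
        first.start();
        second.start();
        first.join();
        second.join();

        System.out.println("successful transitions: " + successes.get());
        System.out.println("inactive nodes: " + inactive.keySet());
    }
}
```

Under these assumptions only one of the two callers reports success, and the node ends up in the inactive map exactly once. A test for the real code would probably need to pause the first transition between the remove and the put to observe any window at all.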
> refreshNodesGracefully() looks at active RMNode list for recommissioning
> decommissioned nodes
> ---------------------------------------------------------------------------------------------
>
> Key: YARN-4386
> URL: https://issues.apache.org/jira/browse/YARN-4386
> Project: Hadoop YARN
> Issue Type: Bug
> Components: graceful
> Affects Versions: 3.0.0
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Priority: Minor
> Attachments: YARN-4386-v1.patch
>
>
> In refreshNodesGracefully(), during recommissioning, the entry set from
> getRMNodes(), which holds only active nodes (RUNNING, DECOMMISSIONING, etc.),
> is used to check for 'decommissioned' nodes, which are present only in the
> getInactiveRMNodes() map.
> {code}
> for (Entry<NodeId, RMNode> entry : rmContext.getRMNodes().entrySet()) {
>   .........................
>   // Recommissioning the nodes
>   if (entry.getValue().getState() == NodeState.DECOMMISSIONING
>       || entry.getValue().getState() == NodeState.DECOMMISSIONED) {
>     this.rmContext.getDispatcher().getEventHandler()
>         .handle(new RMNodeEvent(nodeId, RMNodeEventType.RECOMMISSION));
>   }
> }
> {code}
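
The effect described in the issue can be reproduced with a small standalone sketch. Everything here is a hypothetical stand-in (the State enum, the String node ids, the candidates() helper); it only mirrors the state check from the snippet and is not the attached patch. It assumes, as the description states, that DECOMMISSIONED nodes live only in the inactive map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the recommission check; not the real YARN classes.
class RecommissionSketch {
    enum State { RUNNING, DECOMMISSIONING, DECOMMISSIONED }

    // Applies the same state check as the quoted snippet to whichever map is passed in.
    static List<String> candidates(Map<String, State> nodes) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, State> entry : nodes.entrySet()) {
            if (entry.getValue() == State.DECOMMISSIONING
                    || entry.getValue() == State.DECOMMISSIONED) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Active map: RUNNING and DECOMMISSIONING nodes only.
        Map<String, State> active = new HashMap<>();
        active.put("host1:1234", State.RUNNING);
        active.put("host2:1234", State.DECOMMISSIONING);

        // Inactive map: the only place a DECOMMISSIONED node appears.
        Map<String, State> inactive = new HashMap<>();
        inactive.put("host3:1234", State.DECOMMISSIONED);

        // Scanning only the active map never reaches host3.
        System.out.println("from active map:   " + candidates(active));   // [host2:1234]
        System.out.println("from inactive map: " + candidates(inactive)); // [host3:1234]
    }
}
```

In this model the DECOMMISSIONED branch of the condition can never be true for the active map, so the decommissioned node is found only when the inactive map is scanned as well.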
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)