Rohith commented on YARN-3222:

bq. NODE_USABLE event is sent regardless of whether the reconnected node is 
healthy or not, which is incorrect, right?
Yes, I think it was assumed that if a node is reconnecting then the NM is 
healthy. It is better to retain the old state, i.e. UNHEALTHY, and on the first 
heartbeat after the reconnect the node can be moved from UNHEALTHY to RUNNING 
based on its NodeStatus.
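
To make the proposed behaviour concrete, here is a minimal sketch. It assumes only two states and a boolean health flag; {{NodeHealthSketch}}, {{onReconnect}} and {{onHeartbeat}} are hypothetical names for illustration, not the actual RMNodeImpl transitions.

```java
// Hypothetical, simplified model; not the real RMNodeImpl state machine.
final class NodeHealthSketch {

  enum State { RUNNING, UNHEALTHY }

  /** On reconnect, keep the state the old node already had. */
  static State onReconnect(State oldState) {
    return oldState;
  }

  /** On a heartbeat, the health reported by the NM decides the state. */
  static State onHeartbeat(State current, boolean reportedHealthy) {
    return reportedHealthy ? State.RUNNING : State.UNHEALTHY;
  }

  public static void main(String[] args) {
    State afterReconnect = onReconnect(State.UNHEALTHY);
    State afterHeartbeat = onHeartbeat(afterReconnect, true);
    System.out.println(afterReconnect + " -> " + afterHeartbeat);
  }
}
```

With this shape, a reconnect alone never reports a previously unhealthy node as usable; only a healthy heartbeat does.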

I see another potential issue: if the old node is retained, the RMNode's 
{{totalCapability}} has to be updated with the new RMNode's resource. But in the 
current flow {{totalCapability}} is not updated. As a result, the scheduler has 
the updated resource values but the RMNode holds a stale value. Any client 
reading the node capability from the RMNode would get the wrong node resource 
value.

if (noRunningApps) {
  // some code
      new NodeRemovedSchedulerEvent(rmNode));
  if (rmNode.getHttpPort() == newNode.getHttpPort()) {
    if (rmNode.getState() != NodeState.UNHEALTHY) {
      // Only add new node if old state is not UNHEALTHY
          new NodeAddedSchedulerEvent(newNode));  // NEW NODE CAPABILITY
    }
  } else {
    // Reconnected node differs, so replace old node and start new node
        new RMNodeStartedEvent(newNode.getNodeID(), null, null));
    // No need to update totalCapability since old node is replaced with new node.
  }
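
The stale-capability problem can be reproduced in isolation. The sketch below is an illustration under stated assumptions: {{ReconnectSketch}}, {{Node}}, {{Scheduler}} and the {{syncCapability}} flag are hypothetical stand-ins, memory is the only resource, and none of this is the real YARN API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins; not the real YARN classes.
final class ReconnectSketch {

  /** RM-side node record; clients read the capability from here. */
  static final class Node {
    final String id;
    int totalCapabilityMb;

    Node(String id, int totalCapabilityMb) {
      this.id = id;
      this.totalCapabilityMb = totalCapabilityMb;
    }
  }

  /** Scheduler keeping its own view of each node's resources. */
  static final class Scheduler {
    private final Map<String, Integer> capabilityMb = new HashMap<>();

    void nodeAdded(String id, int mb) { capabilityMb.put(id, mb); }

    void nodeRemoved(String id) { capabilityMb.remove(id); }

    int capabilityOf(String id) {
      Integer mb = capabilityMb.get(id);
      if (mb == null) {
        throw new IllegalStateException("unknown node " + id);
      }
      return mb;
    }
  }

  /**
   * Reconnect that retains the old node object. The scheduler always sees
   * the new capability; the node record only does if syncCapability is set.
   */
  static void reconnect(Node oldNode, int newCapabilityMb,
      Scheduler scheduler, boolean syncCapability) {
    scheduler.nodeRemoved(oldNode.id);
    scheduler.nodeAdded(oldNode.id, newCapabilityMb);
    if (syncCapability) {
      oldNode.totalCapabilityMb = newCapabilityMb;
    }
  }

  public static void main(String[] args) {
    Scheduler scheduler = new Scheduler();
    Node node = new Node("host:1234", 8192);
    scheduler.nodeAdded(node.id, node.totalCapabilityMb);

    // Without the sync, the two views disagree after the reconnect.
    reconnect(node, 4096, scheduler, false);
    System.out.println("scheduler=" + scheduler.capabilityOf(node.id)
        + " node=" + node.totalCapabilityMb);

    // With the sync, both views agree.
    reconnect(node, 4096, scheduler, true);
    System.out.println("scheduler=" + scheduler.capabilityOf(node.id)
        + " node=" + node.totalCapabilityMb);
  }
}
```

The point of the {{syncCapability}} branch is that whenever the old node object is kept, its capability must be copied from the reconnecting node; when the old node is replaced outright, no copy is needed.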

> RMNodeImpl#ReconnectNodeTransition should send scheduler events in sequential 
> order
> -----------------------------------------------------------------------------------
>                 Key: YARN-3222
>                 URL: https://issues.apache.org/jira/browse/YARN-3222
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Rohith
>            Assignee: Rohith
>            Priority: Critical
>         Attachments: 0001-YARN-3222.patch
> When a node is reconnected, RMNodeImpl#ReconnectNodeTransition notifies the 
> scheduler with the events node_added, node_removed or node_resource_update. 
> These events should be sent in sequential order, i.e. the node_added event 
> first and then the node_resource_update event.
> But if the node is reconnected with a different http port, the order of 
> scheduler events is node_removed --> node_resource_update --> node_added, 
> which causes the scheduler to not find the node and throw an NPE, and the RM 
> exits.
> The node_resource_update event should always be triggered via 

This message was sent by Atlassian JIRA
