[ 
https://issues.apache.org/jira/browse/YARN-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966301#comment-13966301
 ] 

Zhijie Shen commented on YARN-1928:
-----------------------------------

There's a race condition in the following code:
{code}
    MockNM nm1 = rm.registerNode("127.0.0.1:1234", 10000);
    MockNM nm2 = rm.registerNode("127.0.0.2:1234", 10000);
    MockNM nm3 = rm.registerNode("127.0.0.3:1234", 10000);
    MockNM nm4 = rm.registerNode("127.0.0.4:1234", 10000);

    RMApp app1 = rm.submitApp(2000);
{code}

The app may already be in the RM context before the newly registered nodes 
trigger NODE_USABLE, so each of those nodes gets added to the app's 
updatedNodes. See the following trace, which shows that the app already exists 
before any NODE_USABLE is processed. That's why the first allocate reports 4 
updated nodes instead of 0.

{code}
2014-04-10 02:04:51,931 INFO  [main] resourcemanager.RMAuditLogger 
(RMAuditLogger.java:logSuccess(142)) - USER=jenkins   OPERATION=Submit 
Application Request    TARGET=ClientRMService  RESULT=SUCCESS  
APPID=application_1397120691650_0001
App : application_1397120691650_0001 State is : NEW Waiting for state : ACCEPTED
2014-04-10 02:04:51,951 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEvent.EventType: 
STARTED
2014-04-10 02:04:51,951 DEBUG [AsyncDispatcher event handler] rmnode.RMNodeImpl 
(RMNodeImpl.java:handle(373)) - Processing 127.0.0.1:1234 of type STARTED
2014-04-10 02:04:51,953 INFO  [AsyncDispatcher event handler] rmnode.RMNodeImpl 
(RMNodeImpl.java:handle(385)) - 127.0.0.1:1234 Node Transitioned from NEW to 
RUNNING
2014-04-10 02:04:51,954 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEvent.EventType: 
STARTED
2014-04-10 02:04:51,954 DEBUG [AsyncDispatcher event handler] rmnode.RMNodeImpl 
(RMNodeImpl.java:handle(373)) - Processing 127.0.0.2:1234 of type STARTED
2014-04-10 02:04:51,955 INFO  [AsyncDispatcher event handler] rmnode.RMNodeImpl 
(RMNodeImpl.java:handle(385)) - 127.0.0.2:1234 Node Transitioned from NEW to 
RUNNING
2014-04-10 02:04:51,955 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEvent.EventType: 
STARTED
2014-04-10 02:04:51,955 DEBUG [AsyncDispatcher event handler] rmnode.RMNodeImpl 
(RMNodeImpl.java:handle(373)) - Processing 127.0.0.3:1234 of type STARTED
2014-04-10 02:04:51,955 INFO  [AsyncDispatcher event handler] rmnode.RMNodeImpl 
(RMNodeImpl.java:handle(385)) - 127.0.0.3:1234 Node Transitioned from NEW to 
RUNNING
2014-04-10 02:04:51,955 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEvent.EventType: 
STARTED
2014-04-10 02:04:51,955 DEBUG [AsyncDispatcher event handler] rmnode.RMNodeImpl 
(RMNodeImpl.java:handle(373)) - Processing 127.0.0.4:1234 of type STARTED
2014-04-10 02:04:51,955 INFO  [AsyncDispatcher event handler] rmnode.RMNodeImpl 
(RMNodeImpl.java:handle(385)) - 127.0.0.4:1234 Node Transitioned from NEW to 
RUNNING
2014-04-10 02:04:51,955 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEvent.EventType: 
START
2014-04-10 02:04:51,957 DEBUG [AsyncDispatcher event handler] rmapp.RMAppImpl 
(RMAppImpl.java:handle(627)) - Processing event for 
application_1397120691650_0001 of type START
2014-04-10 02:04:51,957 INFO  [AsyncDispatcher event handler] rmapp.RMAppImpl 
(RMAppImpl.java:transition(863)) - Storing application with id 
application_1397120691650_0001
2014-04-10 02:04:51,959 INFO  [AsyncDispatcher event handler] rmapp.RMAppImpl 
(RMAppImpl.java:handle(639)) - application_1397120691650_0001 State change from 
NEW to NEW_SAVING
2014-04-10 02:04:51,959 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.NodeAddedSchedulerEvent.EventType:
 NODE_ADDED
2014-04-10 02:04:51,961 INFO  [AsyncDispatcher event handler] 
capacity.CapacityScheduler (CapacityScheduler.java:addNode(936)) - Added node 
127.0.0.1:1234 clusterResource: <memory:10000, vCores:9>
2014-04-10 02:04:51,961 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.NodesListManagerEvent.EventType: 
NODE_USABLE
2014-04-10 02:04:51,963 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.NodeAddedSchedulerEvent.EventType:
 NODE_ADDED
2014-04-10 02:04:51,964 INFO  [AsyncDispatcher event handler] 
capacity.CapacityScheduler (CapacityScheduler.java:addNode(936)) - Added node 
127.0.0.2:1234 clusterResource: <memory:20000, vCores:18>
2014-04-10 02:04:51,964 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.NodesListManagerEvent.EventType: 
NODE_USABLE
2014-04-10 02:04:51,964 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.NodeAddedSchedulerEvent.EventType:
 NODE_ADDED
2014-04-10 02:04:51,965 INFO  [AsyncDispatcher event handler] 
capacity.CapacityScheduler (CapacityScheduler.java:addNode(936)) - Added node 
127.0.0.3:1234 clusterResource: <memory:30000, vCores:27>
2014-04-10 02:04:51,965 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.NodesListManagerEvent.EventType: 
NODE_USABLE
2014-04-10 02:04:51,965 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.NodeAddedSchedulerEvent.EventType:
 NODE_ADDED
2014-04-10 02:04:51,965 INFO  [AsyncDispatcher event handler] 
capacity.CapacityScheduler (CapacityScheduler.java:addNode(936)) - Added node 
127.0.0.4:1234 clusterResource: <memory:40000, vCores:36>
2014-04-10 02:04:51,965 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.NodesListManagerEvent.EventType: 
NODE_USABLE
2014-04-10 02:04:51,965 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppNodeUpdateEvent.EventType:
 NODE_UPDATE
2014-04-10 02:04:51,965 DEBUG [AsyncDispatcher event handler] rmapp.RMAppImpl 
(RMAppImpl.java:handle(627)) - Processing event for 
application_1397120691650_0001 of type NODE_UPDATE
2014-04-10 02:04:51,967 DEBUG [AsyncDispatcher event handler] rmapp.RMAppImpl 
(RMAppImpl.java:processNodeUpdate(684)) - Received node update 
event:NODE_USABLE for node:127.0.0.1:1234 with state:RUNNING
2014-04-10 02:04:51,967 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppNodeUpdateEvent.EventType:
 NODE_UPDATE
2014-04-10 02:04:51,993 DEBUG [AsyncDispatcher event handler] rmapp.RMAppImpl 
(RMAppImpl.java:handle(627)) - Processing event for 
application_1397120691650_0001 of type NODE_UPDATE
2014-04-10 02:04:51,993 DEBUG [AsyncDispatcher event handler] rmapp.RMAppImpl 
(RMAppImpl.java:processNodeUpdate(684)) - Received node update 
event:NODE_USABLE for node:127.0.0.2:1234 with state:RUNNING
2014-04-10 02:04:51,993 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppNodeUpdateEvent.EventType:
 NODE_UPDATE
2014-04-10 02:04:51,993 DEBUG [AsyncDispatcher event handler] rmapp.RMAppImpl 
(RMAppImpl.java:handle(627)) - Processing event for 
application_1397120691650_0001 of type NODE_UPDATE
2014-04-10 02:04:51,993 DEBUG [AsyncDispatcher event handler] rmapp.RMAppImpl 
(RMAppImpl.java:processNodeUpdate(684)) - Received node update 
event:NODE_USABLE for node:127.0.0.3:1234 with state:RUNNING
2014-04-10 02:04:51,993 DEBUG [AsyncDispatcher event handler] 
event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the 
event 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppNodeUpdateEvent.EventType:
 NODE_UPDATE
2014-04-10 02:04:51,993 DEBUG [AsyncDispatcher event handler] rmapp.RMAppImpl 
(RMAppImpl.java:handle(627)) - Processing event for 
application_1397120691650_0001 of type NODE_UPDATE
2014-04-10 02:04:51,993 DEBUG [AsyncDispatcher event handler] rmapp.RMAppImpl 
(RMAppImpl.java:processNodeUpdate(684)) - Received node update 
event:NODE_USABLE for node:127.0.0.4:1234 with state:RUNNING
{code}
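
The ordering in the trace can be reproduced with a small toy model. This is only a sketch under assumptions: NodeUpdateRaceSketch, registerNode, submitApp and drainEvents below are hypothetical stand-ins written for illustration, not the MockRM or AsyncDispatcher classes, and the event queue is drained by hand so that the outcome is deterministic. It shows that queued node events reach an app that was put into the context before they were handled, and that handling them first leaves the app with no updated nodes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class NodeUpdateRaceSketch {
    // Hypothetical stand-in for the RM context: app id -> nodes reported as updated.
    private final Map<String, List<String>> appsInContext = new HashMap<>();
    // Hypothetical stand-in for the asynchronous dispatcher's event queue.
    private final Queue<Runnable> pendingEvents = new ArrayDeque<>();

    // Registering a node only queues its NODE_USABLE handling; nothing runs yet.
    void registerNode(String nodeId) {
        pendingEvents.add(() -> {
            // When handled, the node is added to every app currently in the context.
            for (List<String> updatedNodes : appsInContext.values()) {
                updatedNodes.add(nodeId);
            }
        });
    }

    // Submitting an app puts it into the context immediately (synchronously).
    void submitApp(String appId) {
        appsInContext.put(appId, new ArrayList<>());
    }

    // Handle all queued events in order, like a dispatcher catching up.
    void drainEvents() {
        Runnable event;
        while ((event = pendingEvents.poll()) != null) {
            event.run();
        }
    }

    int updatedNodeCount(String appId) {
        return appsInContext.get(appId).size();
    }

    // Returns how many updated nodes the app sees on its first check.
    public static int run(boolean drainBeforeSubmit) {
        NodeUpdateRaceSketch rm = new NodeUpdateRaceSketch();
        for (int i = 1; i <= 4; i++) {
            rm.registerNode("127.0.0." + i + ":1234");
        }
        if (drainBeforeSubmit) {
            rm.drainEvents(); // node events are handled while no app exists
        }
        rm.submitApp("app1");
        rm.drainEvents();     // any node events still queued now reach app1
        return rm.updatedNodeCount("app1");
    }

    public static void main(String[] args) {
        System.out.println("submit before node events are handled: " + run(false)); // 4
        System.out.println("drain node events, then submit:        " + run(true));  // 0
    }
}
```

In this model the failing case corresponds to run(false). Waiting for the node events to be handled before submitting the app, as in run(true), is one possible way to make the test deterministic; that is an assumption here, not a description of the committed patch.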

> TestAMRMRPCNodeUpdates fails occasionally
> -----------------------------------------
>
>                 Key: YARN-1928
>                 URL: https://issues.apache.org/jira/browse/YARN-1928
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>
> {code}
> junit.framework.AssertionFailedError: expected:<0> but was:<4>
>       at junit.framework.Assert.fail(Assert.java:50)
>       at junit.framework.Assert.failNotEquals(Assert.java:287)
>       at junit.framework.Assert.assertEquals(Assert.java:67)
>       at junit.framework.Assert.assertEquals(Assert.java:199)
>       at junit.framework.Assert.assertEquals(Assert.java:205)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates.testAMRMUnusableNodes(TestAMRMRPCNodeUpdates.java:136)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)