[ https://issues.apache.org/jira/browse/YARN-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936633#comment-16936633 ]

Adam Antal commented on YARN-9011:
----------------------------------

Thank you for your patch [~pbacsko]. Overall I agree with your resolution, so 
I'm fine with the main idea behind this patch.

Some things to consider / improve:
- Structurally I'd like to propose the following modification in 
{{DecommissioningNodesSyncer}}. I think a new config, e.g. 
"yarn.resourcemanager.decommission-syncer.wait-ms", would come in handy for this 
class instead of specifying the timeout for each call of 
{{DecommissioningNodesSyncer$awaitDecommissiningStatus}}. This config can be 
read in a new {{init()}} function or in the constructor - the former is the 
standard in Hadoop, though there are examples of the latter. Also, the latter 
would cause less churn in the other pieces of code you touched, while an 
{{init()}} function would cause more, but it's your choice. I think the default 
(10000 ms) is sufficient, and you can also add that to {{YarnConfiguration}} 
(see the sketch after this list).
- In {{DecommissioningNodesSyncer}} you use {{ConcurrentLinkedDeque}} as the data 
structure. As far as I can see, the only operations used are {{add}}, {{remove}} and 
{{contains}}, which makes me think a concurrent hash set (probably Guava's 
{{Sets.newConcurrentHashSet()}}) would be a more suitable choice.
- In {{NodeListManager}} I'd prefer using a lock instead of grabbing an object and 
using that for synchronisation.
- I would ask 2 more people to take a look at this, but I think the 
synchronisation is proper.
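
Roughly what I have in mind for the first two points - this is just a sketch, 
and the constant names and the internals of {{DecommissioningNodesSyncer}} are 
my assumptions, not necessarily how your patch is actually structured:
{code:java}
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

import com.google.common.collect.Sets;

import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch only: the real class in the patch may be organised differently.
public class DecommissioningNodesSyncer {

  // Hypothetical constants; they would live in YarnConfiguration.
  public static final String DECOMMISSION_SYNCER_WAIT_MS =
      "yarn.resourcemanager.decommission-syncer.wait-ms";
  public static final long DEFAULT_DECOMMISSION_SYNCER_WAIT_MS = 10000L;

  // Guava's concurrent hash set instead of ConcurrentLinkedDeque:
  // only add/remove/contains are needed, and it also rules out duplicates.
  private final Set<NodeId> decommissioningNodes = Sets.newConcurrentHashSet();

  private final ReentrantLock lock = new ReentrantLock();
  private final Condition decommissioningStarted = lock.newCondition();

  private long waitMs = DEFAULT_DECOMMISSION_SYNCER_WAIT_MS;

  // The timeout is read once here, so callers of awaitDecommissiningStatus()
  // no longer have to pass it in.
  public void init(YarnConfiguration conf) {
    waitMs = conf.getLong(DECOMMISSION_SYNCER_WAIT_MS,
        DEFAULT_DECOMMISSION_SYNCER_WAIT_MS);
  }

  public void setDecomissioningStatus(NodeId nodeId) {
    decommissioningNodes.add(nodeId);
    lock.lock();
    try {
      decommissioningStarted.signalAll();
    } finally {
      lock.unlock();
    }
  }

  // Returns true if the node reached DECOMMISSIONING within the configured
  // timeout, false otherwise.
  public boolean awaitDecommissiningStatus(NodeId nodeId)
      throws InterruptedException {
    long remainingNanos = TimeUnit.MILLISECONDS.toNanos(waitMs);
    lock.lock();
    try {
      while (!decommissioningNodes.contains(nodeId)) {
        if (remainingNanos <= 0) {
          return false;
        }
        remainingNanos = decommissioningStarted.awaitNanos(remainingNanos);
      }
      return true;
    } finally {
      lock.unlock();
    }
  }
}
{code}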

Optionally, I'd suggest adding a reference to this new form of synchronisation 
to the javadoc of the {{ResourceTrackerService}} class, but I realised it does not 
have any javadoc at all. If you're diligent, you can write it, but I assume that's 
out of scope for this patch.

Minor nits:
- In the javadoc of {{DecommissioningNodesSyncer}} some @links would be prettier
- Also add some javadoc to the public functions of that class 
({{awaitDecommissiningStatus}} and {{setDecomissioningStatus}} are enough)
- Could you add a meaningful message to this assert in 
{{TestResourceTrackerService$testWaitForDecommTransition}}?
{code:java}
  assertEquals("NodeAction", NodeAction.SHUTDOWN, response.getNodeAction());

{code}
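For instance something along these lines - the exact wording is of course up to you, 
this is just to illustrate the intent:
{code:java}
  assertEquals("Unexpected node action in the heartbeat response after decommissioning",
      NodeAction.SHUTDOWN, response.getNodeAction());
{code}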

Thanks!

> Race condition during decommissioning
> -------------------------------------
>
>                 Key: YARN-9011
>                 URL: https://issues.apache.org/jira/browse/YARN-9011
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.1.1
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-9011-001.patch, YARN-9011-002.patch, 
> YARN-9011-003.patch, YARN-9011-004.patch
>
>
> During internal testing, we found a nasty race condition which occurs during 
> decommissioning.
> Node manager, incorrect behaviour:
> {noformat}
> 2018-06-18 21:00:17,634 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received 
> SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting 
> down.
> 2018-06-18 21:00:17,634 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from 
> ResourceManager: Disallowed NodeManager nodeId: node-6.hostname.com:8041 
> hostname:node-6.hostname.com
> {noformat}
> Node manager, expected behaviour:
> {noformat}
> 2018-06-18 21:07:37,377 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received 
> SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting 
> down.
> 2018-06-18 21:07:37,377 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from 
> ResourceManager: DECOMMISSIONING  node-6.hostname.com:8041 is ready to be 
> decommissioned
> {noformat}
> Note the two different messages from the RM ("Disallowed NodeManager" vs 
> "DECOMMISSIONING"). The problem is that {{ResourceTrackerService}} can see an 
> inconsistent state of nodes while they're being updated:
> {noformat}
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: hostsReader 
> include:{172.26.12.198,node-7.hostname.com,node-2.hostname.com,node-5.hostname.com,172.26.8.205,node-8.hostname.com,172.26.23.76,172.26.22.223,node-6.hostname.com,172.26.9.218,node-4.hostname.com,node-3.hostname.com,172.26.13.167,node-9.hostname.com,172.26.21.221,172.26.10.219}
>  exclude:{node-6.hostname.com}
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: Gracefully 
> decommission node node-6.hostname.com:8041 with state RUNNING
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: 
> Disallowed NodeManager nodeId: node-6.hostname.com:8041 node: 
> node-6.hostname.com
> 2018-06-18 21:00:17,576 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Put Node 
> node-6.hostname.com:8041 in DECOMMISSIONING.
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn     
> IP=172.26.22.115        OPERATION=refreshNodes  TARGET=AdminService     
> RESULT=SUCCESS
> 2018-06-18 21:00:17,577 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Preserve 
> original total capability: <memory:8192, vCores:8>
> 2018-06-18 21:00:17,577 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: 
> node-6.hostname.com:8041 Node Transitioned from RUNNING to DECOMMISSIONING
> {noformat}
> When the decommissioning succeeds, there is no output logged from 
> {{ResourceTrackerService}}.


