[ https://issues.apache.org/jira/browse/YARN-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936633#comment-16936633 ]
Adam Antal commented on YARN-9011:
----------------------------------

Thank you for your patch [~pbacsko]. Overall I agree with your resolution, so I'm fine with the main idea behind this patch. Some things to consider / improve:
- Structurally, I'd like to propose the following modification in {{DecommissioningNodesSyncer}}. I think a new config, e.g. "yarn.resourcemanager.decommission-syncer.wait-ms", would come in handy for this class instead of specifying the timeout for each call of {{DecommissioningNodesSyncer$awaitDecommissiningStatus}}. This config can be read in a new {{init()}} function or in the constructor - the former is the standard in Hadoop, though there are examples of the latter. Also, the constructor approach would require fewer changes in the other pieces of code you touched, while an {{init()}} function would require more, but it's your choice. I think the default (10000 ms) is sufficient; you can also add that to {{YarnConfiguration}}.
- In {{DecommissioningNodesSyncer}} you use {{ConcurrentLinkedDeque}} as the data structure. As far as I can see, the only functions used are {{add}}, {{remove}} and {{contains}}, which makes me think a {{ConcurrentHashSet}} (probably Guava's) would be a more suitable choice.
- In {{NodeListManager}} I prefer using a lock instead of grabbing an object and using it for synchronisation - I would ask two more people to take a look at this, but I think the synchronisation is correct. Optionally, I'd suggest adding a reference to this new form of synchronisation to the javadoc of the {{ResourceTrackerService}} class, but I realised it has no javadoc at all. If you're diligent, you can write it, but I assume that's out of scope for this patch.
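For illustration, here is a minimal sketch of the first two suggestions above (a config-driven wait timeout read once in {{init()}}, and a concurrent set instead of the deque). The class and method names mirror the patch, but the signatures are assumptions, a plain Map stands in for Hadoop's Configuration, and the JDK's ConcurrentHashMap.newKeySet() is used as an equivalent of Guava's concurrent hash set:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: names mirror the patch, but the exact
// signatures are assumptions, not the actual YARN code.
class DecommissioningNodesSyncer {

  // Suggested config key and 10000 ms default; in the real patch these
  // constants would live in YarnConfiguration.
  static final String DECOMMISSION_SYNCER_WAIT_MS =
      "yarn.resourcemanager.decommission-syncer.wait-ms";
  static final long DEFAULT_DECOMMISSION_SYNCER_WAIT_MS = 10000L;

  // A concurrent hash set (the JDK equivalent of Guava's
  // Sets.newConcurrentHashSet()) instead of ConcurrentLinkedDeque:
  // add/remove/contains are all O(1) and duplicates are impossible.
  private final Set<String> decommissioningNodes =
      ConcurrentHashMap.newKeySet();

  private long waitMs = DEFAULT_DECOMMISSION_SYNCER_WAIT_MS;

  // init() pattern: read the timeout once here (standing in for Hadoop's
  // serviceInit(Configuration)) instead of passing it to every call of
  // awaitDecommissioningStatus().
  void init(Map<String, String> conf) {
    String value = conf.get(DECOMMISSION_SYNCER_WAIT_MS);
    if (value != null) {
      waitMs = Long.parseLong(value);
    }
  }

  void setDecommissioningStatus(String nodeId) {
    decommissioningNodes.add(nodeId);
  }

  boolean isDecommissioning(String nodeId) {
    return decommissioningNodes.contains(nodeId);
  }

  long getWaitMs() {
    return waitMs;
  }
}
```

This keeps the timeout out of every call site: callers that previously passed a wait value to each {{awaitDecommissiningStatus}} invocation would simply use the value configured at init time.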
Minor nits:
- In the javadoc of {{DecommissioningNodesSyncer}}, some @links would be prettier.
- Also add javadocs to the public functions of that class ({{awaitDecommissiningStatus}} and {{setDecomissioningStatus}} are enough).
- Could you add a meaningful message to this assert in {{TestResourceTrackerService$testWaitForDecommTransition}}?
{code:java}
assertEquals("NodeAction", NodeAction.SHUTDOWN, response.getNodeAction());
{code}
Thanks!

> Race condition during decommissioning
> -------------------------------------
>
>                 Key: YARN-9011
>                 URL: https://issues.apache.org/jira/browse/YARN-9011
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.1.1
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-9011-001.patch, YARN-9011-002.patch, YARN-9011-003.patch, YARN-9011-004.patch
>
> During internal testing, we found a nasty race condition which occurs during decommissioning.
> Node manager, incorrect behaviour:
> {noformat}
> 2018-06-18 21:00:17,634 WARN org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting down.
> 2018-06-18 21:00:17,634 WARN org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from ResourceManager: Disallowed NodeManager nodeId: node-6.hostname.com:8041 hostname:node-6.hostname.com
> {noformat}
> Node manager, expected behaviour:
> {noformat}
> 2018-06-18 21:07:37,377 WARN org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting down.
> 2018-06-18 21:07:37,377 WARN org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from ResourceManager: DECOMMISSIONING node-6.hostname.com:8041 is ready to be decommissioned
> {noformat}
> Note the two different messages from the RM ("Disallowed NodeManager" vs "DECOMMISSIONING").
> The problem is that {{ResourceTrackerService}} can see an inconsistent state of nodes while they're being updated:
> {noformat}
> 2018-06-18 21:00:17,575 INFO org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: hostsReader include:{172.26.12.198,node-7.hostname.com,node-2.hostname.com,node-5.hostname.com,172.26.8.205,node-8.hostname.com,172.26.23.76,172.26.22.223,node-6.hostname.com,172.26.9.218,node-4.hostname.com,node-3.hostname.com,172.26.13.167,node-9.hostname.com,172.26.21.221,172.26.10.219} exclude:{node-6.hostname.com}
> 2018-06-18 21:00:17,575 INFO org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: Gracefully decommission node node-6.hostname.com:8041 with state RUNNING
> 2018-06-18 21:00:17,575 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: Disallowed NodeManager nodeId: node-6.hostname.com:8041 node: node-6.hostname.com
> 2018-06-18 21:00:17,576 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Put Node node-6.hostname.com:8041 in DECOMMISSIONING.
> 2018-06-18 21:00:17,575 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn IP=172.26.22.115 OPERATION=refreshNodes TARGET=AdminService RESULT=SUCCESS
> 2018-06-18 21:00:17,577 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Preserve original total capability: <memory:8192, vCores:8>
> 2018-06-18 21:00:17,577 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: node-6.hostname.com:8041 Node Transitioned from RUNNING to DECOMMISSIONING
> {noformat}
> When the decommissioning succeeds, there is no output logged from {{ResourceTrackerService}}.