[
https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068939#comment-15068939
]
Daniel Templeton commented on YARN-4311:
----------------------------------------
In the constructor, you check if the inactive nodes list entries are null
before using them. Is that necessary? Further down in {{refreshNodes()}} you
don't. Same thing in {{refreshNodesGracefully()}}.
Inside the new for loop in {{refreshNodes()}} you should combine the nested ifs
into a single conditional.
The multiple ifs and returns in {{isUntracked()}} should be combined:
{code}
public boolean isUntrackedNode(String hostName) {
boolean untracked = false;
String ip = resolver.resolve(hostName);
includesFile =
conf.get(YarnConfiguration.RM_NODES_INCLUDE_FILE_PATH,
YarnConfiguration.DEFAULT_RM_NODES_INCLUDE_FILE_PATH);
synchronized (hostsReader) {
Set<String> hostsList = hostsReader.getHosts();
Set<String> excludeList = hostsReader.getExcludedHosts();
untracked = !hostsList.isEmpty() &&
!hostsList.contains(hostName) && !hostsList.contains(ip) &&
!excludeList.contains(hostName) && !excludeList.contains(ip);
}
return untracked;
}
{code}
Do note that with this solution, if a user does a node refresh at least once
per node removal check interval, no nodes will ever be expunged because the
timestamp will continually be updated and never exceed the interval.
> Removing nodes from include and exclude lists will not remove them from
> decommissioned nodes list
> -------------------------------------------------------------------------------------------------
>
> Key: YARN-4311
> URL: https://issues.apache.org/jira/browse/YARN-4311
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.6.1
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: YARN-4311-v1.patch, YARN-4311-v2.patch,
> YARN-4311-v3.patch
>
>
> In order to fully forget about a node, removing the node from include and
> exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The
> tricky part that [~jlowe] pointed out was the case when include lists are not
> used, in that case we don't want the nodes to fall off if they are not active.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)