[ 
https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097218#comment-15097218
 ] 

Jason Lowe commented on YARN-4311:
----------------------------------

Thanks for updating the patch!

I'd like to see the getTimestamp/setTimestamp methods be a bit more specific, 
like getInactiveTimestamp or getUntrackedTimestamp or something similar.  
Currently it sounds like a generic timestamp and therefore is unclear what the 
context is and when it should and should not be used.

Similarly I think the property names are a bit too vague.  Would like to see 
"inactive", "untracked" or some other adjective in the name to qualify when the 
properties apply.

Nit: interval and timeout sound similar without qualification, e.g.: there's 
already expiry-interval properties that are really timeouts.  Should interval 
be check-interval or something to distinguish it?  And would it make sense to 
derive one from the other, similar to what is done for AM/NM expiry interval?  
(e.g: interval could be a fraction of timeout, potentially with a maximum 
interval cap of 10 minutes or something.)  That's another way to eliminate the 
potential for confusion or misconfig, but it does make it a little more 
inflexible.

Nit: There should be whitespace between the properties for readability and 
consistency with the rest of the file.  Typically the property name and default 
are clustered together with gaps between different properties.

Why does isUntrackedNode do a conf lookup and assign includesFile?  It seems 
unrelated to what it's doing, and if it is related, why not also excludesFile?  
This gets called in a loop, potentially over many nodes.  conf lookups can be a 
bit pricey, and we should avoid doing them if they're not needed.

Time.monotonicNow should be used for the timestamps rather than 
System.currentTimeMillis to avoid problems when the system time is adjusted.

The call to Timer.purge after Timer.cancel is redundant.  There are no tasks on 
the timer after Timer.cancel is called.

> Removing nodes from include and exclude lists will not remove them from 
> decommissioned nodes list
> -------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4311
>                 URL: https://issues.apache.org/jira/browse/YARN-4311
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.6.1
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>         Attachments: YARN-4311-v1.patch, YARN-4311-v2.patch, 
> YARN-4311-v3.patch, YARN-4311-v4.patch, YARN-4311-v5.patch, 
> YARN-4311-v6.patch, YARN-4311-v7.patch
>
>
> In order to fully forget about a node, removing the node from include and 
> exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The 
> tricky part that [~jlowe] pointed out was the case when include lists are not 
> used, in that case we don't want the nodes to fall off if they are not active.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to