[ 
https://issues.apache.org/jira/browse/YARN-9202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16745549#comment-16745549
 ] 

Kuhu Shukla commented on YARN-9202:
-----------------------------------

Thank you Jim for the review. Appreciate it.
bq.  If nodes are in the include list, but never register, what is it that we 
are missing
Currently there is no way to know which nodes should have been part of the 
cluster unless one manually checks the include list. This differs from the 
NameNode, where nodes that have not registered are still listed as dead or in 
another category.
bq. Is it just that those nodes are not included in any metrics? 
More or less, yes. Tracking what *should* be there is harder for operations 
teams.
bq. Can the desired result be accomplished by just adding these nodes to the 
inactive list and leaving them in the NEW state? 
I did think about that. Since NEW nodes were not exposed anywhere on the UI, I 
thought moving them to a somewhat terminal state might be nicer, but I do like 
the idea of having NEW nodes in the inactive list as well. I will have to see 
how much semantic difference it makes in the code, and I will post an update 
shortly.
bq. testIncludeHostsWithNoRegister() - it's not clear to me why the latter half 
of the test is needed?  Looks like it was copied from the previous test but I 
don't see why it needs to be repeated in this one?
True. I will prune the test in the next version.

If keeping the nodes in the NEW state while listing them as inactive is fairly 
straightforward, the next version will include that change as well. 
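
To make the idea concrete, here is a rough, self-contained sketch of listing 
include-list hosts as inactive NEW nodes until they register. It is not taken 
from the patch or from the RM code: IncludeListTracker, its State enum, and 
the register() and neverRegistered() methods are hypothetical names used only 
for illustration, and the two maps are simplified stand-ins for the RM's 
active and inactive node collections.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; these are not the ResourceManager's classes.
class IncludeListTracker {
    // Simplified stand-in for YARN's NodeState; only the values used here.
    enum State { NEW, RUNNING }

    // Hosts that have registered, keyed by hostname.
    private final Map<String, State> activeNodes = new TreeMap<>();
    // Hosts known to the tracker but not active, keyed by hostname.
    private final Map<String, State> inactiveNodes = new TreeMap<>();

    // Seed every include-list host as an inactive NEW node so that it is
    // visible even if it never registers.
    IncludeListTracker(Set<String> includeList) {
        for (String host : includeList) {
            inactiveNodes.put(host, State.NEW);
        }
    }

    // On registration, move the host from the inactive map to the active map.
    // Returns false for hosts that were never in the include list.
    boolean register(String host) {
        if (inactiveNodes.remove(host) == null && !activeNodes.containsKey(host)) {
            return false;
        }
        activeNodes.put(host, State.RUNNING);
        return true;
    }

    // Hosts that are expected (in the include list) but are still NEW.
    Set<String> neverRegistered() {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, State> entry : inactiveNodes.entrySet()) {
            if (entry.getValue() == State.NEW) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

With an include list of host1, host2 and host3, and only host2 registering, 
neverRegistered() returns host1 and host3, which is the set an operations 
team would otherwise have to work out by hand from the include file.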



> RM does not track nodes that are in the include list and never register
> -----------------------------------------------------------------------
>
>                 Key: YARN-9202
>                 URL: https://issues.apache.org/jira/browse/YARN-9202
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.2, 3.0.3, 2.8.5
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>            Priority: Major
>         Attachments: YARN-9202.001.patch
>
>
> The RM state machine puts new or running nodes in the inactive state 
> only after they have either registered or appeared in the exclude list. This 
> does not cover the case where a node is in the include list but never 
> registers. Since all state changes are based on these NodeState 
> transitions, listing NEW nodes as inactive first may help. This 
> would change the semantics of how inactiveNodes are looked at today. Adding 
> another state might help in this case too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

