Tsuyoshi OZAWA commented on YARN-3025:

Let's say 1000 AMs are pinging every 1 sec.

I expected that we would synchronize the state only when the RM detects a 
difference between the blacklists before and after the heartbeat. I think the 
probability of nodes being blacklisted is not so high. What do you think?

Yes. But that would mean that the RM cannot provide the latest updates.

I think it is acceptable in many cases if the blacklisted nodes are updated 
within 1 min or a few minutes, e.g. for admins checking cluster information. In 
this case, we should also document the trade-off of the sync interval 
explicitly.
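
As a rough sketch of the idea (not actual YARN code): the state is synchronized 
only if the blacklist changed during a heartbeat, and at most once per sync 
interval. The class name BlacklistSyncSketch, the syncIntervalMs parameter, and 
the sync counter standing in for the real state-store write are all 
assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names exist in YARN.
class BlacklistSyncSketch {
    private final Set<String> blacklist = new HashSet<>();
    private final long syncIntervalMs; // assumed tunable, e.g. 60_000 for 1 min
    private boolean dirty = false;     // true if the blacklist changed since the last sync
    private long lastSyncMs;
    private int syncCount = 0;         // stand-in for writes to the state store

    BlacklistSyncSketch(long syncIntervalMs, long nowMs) {
        this.syncIntervalMs = syncIntervalMs;
        this.lastSyncMs = nowMs;
    }

    // Called on every AM heartbeat; returns true if a sync was performed.
    synchronized boolean onHeartbeat(List<String> additions,
            List<String> removals, long nowMs) {
        // addAll/removeAll return true only if the set actually changed.
        boolean changed = blacklist.addAll(additions);
        changed |= blacklist.removeAll(removals);
        dirty |= changed;

        // Sync only when something changed and the interval has elapsed.
        if (dirty && nowMs - lastSyncMs >= syncIntervalMs) {
            syncCount++; // the real state synchronization would happen here
            dirty = false;
            lastSyncMs = nowMs;
            return true;
        }
        return false;
    }

    synchronized int getSyncCount() {
        return syncCount;
    }
}
```

With 1000 AMs heartbeating every second, most heartbeats would leave the 
blacklist unchanged and skip the sync. The cost is that a reader of the synced 
state may see data that is up to one sync interval old.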

> Provide API for retrieving blacklisted nodes
> --------------------------------------------
>                 Key: YARN-3025
>                 URL: https://issues.apache.org/jira/browse/YARN-3025
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Ted Yu
> We have the following method which updates blacklist:
> {code}
>   public synchronized void updateBlacklist(List<String> blacklistAdditions,
>       List<String> blacklistRemovals) {
> {code}
> Upon AM failover, there should be an API which returns the blacklisted nodes 
> so that the new AM can make consistent decisions.
> The new API can be:
> {code}
>   public synchronized List<String> getBlacklistedNodes()
> {code}
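
A minimal sketch of how the proposed getter could sit next to updateBlacklist, 
assuming the blacklist is held in an in-memory set. BlacklistHolderSketch is a 
hypothetical stand-in, not the real YARN class, and the defensive copy is an 
assumption about the desired semantics.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the class that owns updateBlacklist.
class BlacklistHolderSketch {
    private final Set<String> blacklist = new HashSet<>();

    public synchronized void updateBlacklist(List<String> blacklistAdditions,
            List<String> blacklistRemovals) {
        if (blacklistAdditions != null) {
            blacklist.addAll(blacklistAdditions);
        }
        if (blacklistRemovals != null) {
            blacklist.removeAll(blacklistRemovals);
        }
    }

    // Returns a copy so that callers cannot mutate the internal set.
    public synchronized List<String> getBlacklistedNodes() {
        return new ArrayList<>(blacklist);
    }
}
```

A new AM attempt could call getBlacklistedNodes() once after failover and seed 
its local view from the result before making allocation decisions.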

This message was sent by Atlassian JIRA
