[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729119#comment-14729119 ]
Sunil G commented on YARN-2005:
-------------------------------

Hi [~adhoot]

Thank you for updating the patch. I have a comment here.

{{isWaitingForAMContainer}} is now used in two cases: to set the {{ContainerType}} and in the blacklist case. This check now runs on every heartbeat from the AM.

I think it's better to keep a state flag called {{amIsStarted}} in {{SchedulerApplicationAttempt}}. It can be set from two places:
1. {{RMAppAttemptImpl#AMContainerAllocatedTransition}} can call a new scheduler API to set the {{amIsStarted}} flag when the AM container is launched and registered. We need to pass the ContainerId to this new API to get the attempt object and set the flag.
2. {{AbstractYarnScheduler#recoverContainersOnNode}} can also invoke this API to set the flag.

We can then read the flag directly from {{SchedulerApplicationAttempt}} every time a heartbeat call comes from the AM.

If we are not doing this in this ticket, I can open another ticket for this optimization. Please share your thoughts.

> Blacklisting support for scheduling AMs
> ---------------------------------------
>
>                 Key: YARN-2005
>                 URL: https://issues.apache.org/jira/browse/YARN-2005
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 0.23.10, 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-2005.001.patch, YARN-2005.002.patch,
> YARN-2005.003.patch, YARN-2005.004.patch, YARN-2005.005.patch,
> YARN-2005.006.patch, YARN-2005.006.patch, YARN-2005.007.patch,
> YARN-2005.008.patch
>
>
> It would be nice if the RM supported blacklisting a node for an AM launch
> after the same node fails a configurable number of AM attempts. This would
> be similar to the blacklisting support for scheduling task attempts in the
> MapReduce AM but for scheduling AM attempts on the RM side.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
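
The cached-flag idea in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the patch: {{AttemptSketch}} and {{SchedulerSketch}} are hypothetical stand-ins for {{SchedulerApplicationAttempt}} and the scheduler, {{markAmStarted}} is an invented name for the proposed "new scheduler api", and container ids are plain strings rather than real ContainerId objects.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these types are the real YARN classes.
public class AmStartedFlagSketch {

    /** Simplified stand-in for SchedulerApplicationAttempt. */
    static class AttemptSketch {
        // Cached state proposed in the comment. Volatile so that a
        // heartbeat thread sees an update made by another thread.
        private volatile boolean amIsStarted = false;

        void setAmIsStarted() {
            amIsStarted = true;
        }

        boolean isAmStarted() {
            return amIsStarted;
        }
    }

    /** Simplified stand-in for the scheduler. */
    static class SchedulerSketch {
        // Maps a container id (a plain string here) to its owning attempt.
        private final Map<String, AttemptSketch> attemptsByContainer =
                new ConcurrentHashMap<>();

        void register(String containerId, AttemptSketch attempt) {
            attemptsByContainer.put(containerId, attempt);
        }

        // Assumed shape of the proposed API: look up the attempt by
        // container id and set the flag. It would be called from the AM
        // container transition and from container recovery.
        boolean markAmStarted(String containerId) {
            AttemptSketch attempt = attemptsByContainer.get(containerId);
            if (attempt == null) {
                return false; // unknown container, nothing to mark
            }
            attempt.setAmIsStarted();
            return true;
        }

        // Heartbeat path: a single field read instead of a recomputed check.
        boolean isWaitingForAmContainer(AttemptSketch attempt) {
            return !attempt.isAmStarted();
        }
    }

    public static void main(String[] args) {
        SchedulerSketch scheduler = new SchedulerSketch();
        AttemptSketch attempt = new AttemptSketch();
        scheduler.register("container_1", attempt); // made-up id

        System.out.println("waiting before start: "
                + scheduler.isWaitingForAmContainer(attempt));
        scheduler.markAmStarted("container_1");
        System.out.println("waiting after start: "
                + scheduler.isWaitingForAmContainer(attempt));
    }
}
```

The point of the design is that the heartbeat path only reads a field, while the two less frequent events (AM container start and recovery) pay the lookup cost once.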