[ 
https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729119#comment-14729119
 ] 

Sunil G commented on YARN-2005:
-------------------------------

Hi [~adhoot]
Thank you for updating the patch. I have one comment.

{{isWaitingForAMContainer}} is now used in 2 cases: to set the 
{{ContainerType}} and in the blacklist check. As a result, this check now runs 
on every heartbeat from the AM.

I think it's better to keep a flag called {{amIsStarted}} in 
{{SchedulerApplicationAttempt}}. This flag can be set from 2 places:
1. {{RMAppAttemptImpl#AMContainerAllocatedTransition}} can call a new scheduler 
API to set the {{amIsStarted}} flag when the AM container is launched and 
registered. We need to pass the ContainerId to this new API so that it can look 
up the attempt object and set the flag.
2. {{AbstractYarnScheduler#recoverContainersOnNode}} can also invoke this API 
to set the flag.

With this, we can read the flag directly from {{SchedulerApplicationAttempt}} 
every time a heartbeat comes from the AM. If we are not doing this in this 
ticket, I can open another ticket for this optimization. Please share your 
thoughts.
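
A rough sketch of the idea, only to illustrate the shape of the change. It 
uses simplified stand-in classes, not the real YARN types: {{AttemptSketch}}, 
{{ContainerIdSketch}}, {{SchedulerSketch}} and the method names 
{{markAmStarted}} and {{isWaitingForAmContainer}} are all hypothetical, and 
the real scheduler API would need to be worked out in the patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for SchedulerApplicationAttempt (not the real class).
class AttemptSketch {
    // volatile: written from the RM transition or the recovery path,
    // read on the AM heartbeat path.
    private volatile boolean amIsStarted = false;

    void setAmIsStarted() { amIsStarted = true; }

    boolean isAmStarted() { return amIsStarted; }
}

// Hypothetical stand-in for ContainerId: it only carries the owning attempt id.
class ContainerIdSketch {
    final String attemptId;
    final int containerNumber;

    ContainerIdSketch(String attemptId, int containerNumber) {
        this.attemptId = attemptId;
        this.containerNumber = containerNumber;
    }
}

// Hypothetical stand-in for the scheduler.
class SchedulerSketch {
    private final Map<String, AttemptSketch> attempts = new ConcurrentHashMap<>();

    void addAttempt(String attemptId, AttemptSketch attempt) {
        attempts.put(attemptId, attempt);
    }

    // The assumed new scheduler API: resolve the attempt from the container id
    // and set the flag. It would be called from both places listed above.
    void markAmStarted(ContainerIdSketch containerId) {
        AttemptSketch attempt = attempts.get(containerId.attemptId);
        if (attempt != null) {
            attempt.setAmIsStarted();
        }
    }

    // Heartbeat path: a plain flag read instead of recomputing the condition.
    // An unknown attempt is treated as still waiting (an assumption of this sketch).
    boolean isWaitingForAmContainer(String attemptId) {
        AttemptSketch attempt = attempts.get(attemptId);
        return attempt == null || !attempt.isAmStarted();
    }
}
```

In this sketch the flag is set once and never cleared, so the heartbeat path 
is just a map lookup plus a volatile read. Whether the real attempt object 
needs to reset the flag in any case is something I have not checked.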

> Blacklisting support for scheduling AMs
> ---------------------------------------
>
>                 Key: YARN-2005
>                 URL: https://issues.apache.org/jira/browse/YARN-2005
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 0.23.10, 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-2005.001.patch, YARN-2005.002.patch, 
> YARN-2005.003.patch, YARN-2005.004.patch, YARN-2005.005.patch, 
> YARN-2005.006.patch, YARN-2005.006.patch, YARN-2005.007.patch, 
> YARN-2005.008.patch
>
>
> It would be nice if the RM supported blacklisting a node for an AM launch 
> after the same node fails a configurable number of AM attempts.  This would 
> be similar to the blacklisting support for scheduling task attempts in the 
> MapReduce AM but for scheduling AM attempts on the RM side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
