[ 
https://issues.apache.org/jira/browse/YARN-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15604718#comment-15604718
 ] 

Varun Saxena edited comment on YARN-5773 at 10/25/16 9:09 AM:
--------------------------------------------------------------

Is there any need to activate applications on recovery? Cluster resources will 
be 0 during recovery anyway, as the resource tracker service has not yet 
started. Maybe pass a flag in the event so that the scheduler knows recovery is 
in progress while adding the attempt.
However, we can also check the cluster resources or user limit right at the 
start of activating applications and return early if the applicable resources 
are 0. That will have the same effect on recovery.

Overall, i.e. in the normal flow, Wangda's suggestion for optimizing 
activateApplications sounds good. But the ordering policy will have to be 
maintained as well, right?
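
To make the early-exit idea concrete, here is a minimal sketch. It is not the 
Hadoop LeafQueue code: the class GuardedQueueSketch, its fields, the fixed AM 
size and the 10% AM share are all assumptions made up for illustration. A FIFO 
queue stands in for the ordering policy.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

// Hypothetical, simplified stand-in for a leaf queue; NOT the Hadoop API.
class GuardedQueueSketch {
    private final Queue<String> pending = new ArrayDeque<>(); // FIFO stands in for the ordering policy
    private final List<String> active = new ArrayList<>();
    private final long amMemoryMb;   // assumed: every AM needs the same memory
    private long clusterMemoryMb;    // stays 0 until a node manager registers
    private long usedAmMemoryMb;
    long limitChecks;                // counts per-application AM limit checks

    GuardedQueueSketch(long amMemoryMb) {
        this.amMemoryMb = amMemoryMb;
    }

    void addApplication(String appId) {
        pending.add(appId);
        activateApplications();
    }

    void nodeRegistered(long memoryMb) {
        clusterMemoryMb += memoryMb;
        activateApplications();
    }

    List<String> activeApplications() {
        return active;
    }

    private void activateApplications() {
        long amLimitMb = clusterMemoryMb / 10; // assumed: AMs may use 10% of the cluster
        if (amLimitMb == 0) {
            return; // early exit: nothing can be activated, so skip the scan
        }
        Iterator<String> it = pending.iterator();
        while (it.hasNext()) {
            String appId = it.next();
            limitChecks++;
            if (usedAmMemoryMb + amMemoryMb > amLimitMb) {
                break; // all AMs are the same size here, so later ones cannot fit either
            }
            usedAmMemoryMb += amMemoryMb;
            active.add(appId); // activation follows pending order
            it.remove();
        }
    }

    public static void main(String[] args) {
        GuardedQueueSketch queue = new GuardedQueueSketch(1024);
        for (int i = 0; i < 10_000; i++) {
            queue.addApplication("app-" + i); // "recovery": no node has registered yet
        }
        System.out.println("checks during recovery: " + queue.limitChecks); // 0
        queue.nodeRegistered(20_480);
        System.out.println("active: " + queue.activeApplications()); // [app-0, app-1]
    }
}
```

With the guard in place, adding 10K applications before any node registers 
performs no limit checks, and activation still follows the pending order once 
resources arrive.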


was (Author: varun_saxena):
Is there any need to activate applications on recovery? Cluster resources will 
be 0 during recovery anyway, as the resource tracker service has not yet 
started.
However, we can also check the cluster resources or user limit right at the 
start of activating applications and return early if the applicable resources 
are 0. That will have the same effect on recovery.

Overall, i.e. in the normal flow, Wangda's suggestion for optimizing 
activateApplications sounds good. But the ordering policy will have to be 
maintained as well, right?

> RM recovery too slow due to LeafQueue#activateApplication()
> -----------------------------------------------------------
>
>                 Key: YARN-5773
>                 URL: https://issues.apache.org/jira/browse/YARN-5773
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>         Attachments: YARN-5773.0001.patch, YARN-5773.0002.patch
>
>
> # Submit 10K applications to the default queue.
> # All applications are in the accepted state.
> # Now restart the ResourceManager.
> For each application recovered, {{LeafQueue#activateApplications()}} is 
> invoked, so the AM limit check is done even before the NodeManagers have 
> registered.
> The total number of iterations for N applications is about {{N(N+1)/2}}; for 
> {{10K}} applications that is about {{50000000}} iterations, which makes the RM 
> take more than 10 minutes to become active.
> Since NM resources have not yet been added during recovery, we should skip 
> {{activateApplications()}}.
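
The iteration count quoted above can be checked with a small model. This is 
only a sketch under the description's own assumption that nothing is activated 
during recovery, so each recovered application triggers a pass over every 
application still pending. The class RecoveryScanCount and its method are 
hypothetical names, not Hadoop code.

```java
// Hypothetical model of the recovery scan; NOT Hadoop code.
class RecoveryScanCount {
    // Total per-application checks when each recovered application
    // triggers one pass over all applications that are still pending.
    static long scans(int applications) {
        long iterations = 0;
        int pending = 0;
        for (int i = 0; i < applications; i++) {
            pending++;             // the recovered application joins the pending list
            iterations += pending; // nothing activates, so the whole list is walked
        }
        return iterations;
    }

    public static void main(String[] args) {
        int n = 10_000;
        System.out.println(scans(n));               // 50005000
        System.out.println((long) n * (n + 1) / 2); // 50005000, the closed form N(N+1)/2
    }
}
```

For 10K applications the exact figure is 50,005,000, which matches the "about 
50000000" in the description.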



