[ 
https://issues.apache.org/jira/browse/YARN-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15397352#comment-15397352
 ] 

Jun Gong commented on YARN-5333:
--------------------------------

Sorry for the late reply. Thanks [~rohithsharma], [~sunilg] and [~jianhe] for 
the suggestions.

{quote}
IIRC, when RMWebApp starts, the clientRMService instance is injected. If we do 
not initialize activeServices in standby, RMWebApp startup fails. This needs 
more digging.
{quote}
Do you mean {{ResourceManager#startWepApp}} will fail? It seems to be called 
at the beginning of RM startup. How could I verify it? Thanks!

I think skipping the RM state check for the transitionToActive case works. I 
could not figure out why we need to check the RM status; in my opinion the RM 
will not execute these refresh functions if it is in active state. Could you 
please explain more?

bq. I prefer initializing the services before starting them. Then we don't 
need to init the services when transitioning to standby, and there is no need 
to call refreshAll. 
Just another thought: if the time spent in {{reinitialize}} does not matter 
much, how about initializing the services in two places (at the beginning of 
transitionToActive and at the end of transitionToStandby)?
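
To illustrate why the ordering matters, here is a minimal toy model. It is not 
Hadoop code: {{HaQueueReloadSketch}}, {{ToyScheduler}}, {{ToyResourceManager}} 
and {{simulate}} are hypothetical names, and the "allocation file" is only an 
in-memory set. It assumes that an unknown queue falls back to "default", as 
described in this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the ordering problem; every name here is hypothetical. */
public class HaQueueReloadSketch {

    /** Stand-in for a scheduler whose queues come from an allocation file. */
    static class ToyScheduler {
        private final Set<String> queues = new HashSet<>();

        /** Replaces the known queues with a snapshot of those "on disk". */
        void reinitialize(Set<String> queuesOnDisk) {
            queues.clear();
            queues.add("default");
            queues.addAll(queuesOnDisk);
        }

        /** Returns the requested queue if known, otherwise "default". */
        String place(String requestedQueue) {
            return queues.contains(requestedQueue) ? requestedQueue : "default";
        }
    }

    /** Stand-in for an RM that loads its queues once at startup. */
    static class ToyResourceManager {
        private final ToyScheduler scheduler = new ToyScheduler();
        private final Set<String> queuesOnDisk;
        private final boolean reinitOnTransition;

        ToyResourceManager(Set<String> queuesOnDisk, boolean reinitOnTransition) {
            this.queuesOnDisk = queuesOnDisk;
            this.reinitOnTransition = reinitOnTransition;
            scheduler.reinitialize(queuesOnDisk); // snapshot taken at startup
        }

        /** Optionally reloads the queues, then places the recovered apps. */
        List<String> transitionToActive(List<String> recoveredAppQueues) {
            if (reinitOnTransition) {
                scheduler.reinitialize(queuesOnDisk); // reload before recovery
            }
            List<String> placed = new ArrayList<>();
            for (String queue : recoveredAppQueues) {
                placed.add(scheduler.place(queue));
            }
            return placed;
        }

        /** Optionally reloads the queues at the end of the transition. */
        void transitionToStandby() {
            if (reinitOnTransition) {
                scheduler.reinitialize(queuesOnDisk);
            }
        }
    }

    /** Starts a standby RM, then adds a queue, then fails over to it. */
    public static List<String> simulate(boolean reinitOnTransition) {
        Set<String> disk = new HashSet<>();
        ToyResourceManager standby = new ToyResourceManager(disk, reinitOnTransition);
        disk.add("root.newq"); // the file is edited after the RM has started
        return standby.transitionToActive(Arrays.asList("root.newq"));
    }

    public static void main(String[] args) {
        System.out.println(simulate(false)); // prints [default]
        System.out.println(simulate(true));  // prints [root.newq]
    }
}
```

Without the reload, the recovered app lands in "default" because the queue 
snapshot predates the file change; with the reload at the beginning of 
transitionToActive, it keeps its original queue.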

> Some recovered apps are put into default queue when RM HA
> ---------------------------------------------------------
>
>                 Key: YARN-5333
>                 URL: https://issues.apache.org/jira/browse/YARN-5333
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jun Gong
>            Assignee: Jun Gong
>         Attachments: YARN-5333.01.patch, YARN-5333.02.patch, 
> YARN-5333.03.patch
>
>
> Enable RM HA and use FairScheduler, with 
> {{yarn.scheduler.fair.allow-undeclared-pools}} set to false and 
> {{yarn.scheduler.fair.user-as-default-queue}} set to false.
> Steps to reproduce:
> 1. Start two RMs.
> 2. After the RMs are running, edit {{etc/hadoop/fair-scheduler.xml}} on both 
> RMs and add some queues.
> 3. Submit some apps to the newly added queues.
> 4. Stop the active RM; the standby RM will then transition to active and 
> recover the apps.
> However, the new active RM will put the recovered apps into the default queue 
> because it might not have loaded the new {{fair-scheduler.xml}}. We need to 
> call {{initScheduler}} before starting active services, or move 
> {{refreshAll()}} in front of {{rm.transitionToActive()}}. *It seems this is 
> also important for other schedulers*.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
