[ 
https://issues.apache.org/jira/browse/YARN-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16288758#comment-16288758
 ] 

Sunil G commented on YARN-7643:
-------------------------------

Thanks [~suma.shivaprasad]. Some comments here.
#
{code}
  void replaceQueueFromPlacementContext(
      ApplicationPlacementContext placementContext,
      ApplicationSubmissionContext context) {
    // Set it to ApplicationSubmissionContext
    //apply queue mapping only to new application submissions
    if (placementContext != null && !StringUtils.equalsIgnoreCase(
        context.getQueue(), placementContext.getQueue())) {
      LOG.info("Placed application=" + context.getApplicationId()
          + " to queue=" + placementContext.getQueue()
          + ", original queue=" + context.getQueue());
      context.setQueue(placementContext.getQueue());
    }
  }
{code}
The queue chosen by placement is already written to the submission context 
during application submission, so on recovery we already have the mapped queue 
name. {{UserGroupMappingPlacementRule.getPlacementForApp}} will therefore 
return the correct mapped queue name, yet we still redo the same work. I think 
the current issue arises because the event below has to be fired from 
RMAppImpl to the scheduler, and *placementContext* will be null in the 
recovery case (might this also break normal user-mapping?).
{code}
      app.scheduler.handle(
          new AppAddedSchedulerEvent(app.user, app.submissionContext, true,
              app.applicationPriority, app.placementContext));
{code}
A couple of suggestions:
1. Could we save *placementContext* as part of the app data in the state store?
2. While recomputing *placeApplication*, could we bypass some API calls from 
{{PlacementManager}}, since we already have the mapped queue name?
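
To make suggestion 2 concrete, here is a rough sketch of what I have in mind. It is not patch code: {{PlacementContextSketch}}, {{RecoveryPlacementSketch}}, the {{isRecovery}} flag and the {{ruleEvaluations}} counter are all hypothetical stand-ins, and only the queue name is modeled. The idea is simply that on recovery the queue already stored in the submission context is reused and the mapping rules are not evaluated again.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for ApplicationPlacementContext; models only the queue name. */
final class PlacementContextSketch {
    private final String queue;

    PlacementContextSketch(String queue) {
        this.queue = queue;
    }

    String getQueue() {
        return queue;
    }
}

/** Hypothetical stand-in for the placement step; not the real PlacementManager API. */
final class RecoveryPlacementSketch {
    // Assumed user -> queue mapping, standing in for the configured placement rules.
    private final Map<String, String> userToQueue = new HashMap<>();

    // Counts how often the (potentially expensive) rules were actually evaluated.
    int ruleEvaluations = 0;

    void addMapping(String user, String queue) {
        userToQueue.put(user, queue);
    }

    /**
     * Returns the placement for an application. On recovery, the queue already
     * stored in the submission context is trusted and the rules are skipped.
     */
    PlacementContextSketch place(String user, String submittedQueue, boolean isRecovery) {
        if (isRecovery && submittedQueue != null && !submittedQueue.isEmpty()) {
            // The mapped queue was written at submission time; reuse it as-is.
            return new PlacementContextSketch(submittedQueue);
        }
        ruleEvaluations++;
        String mapped = userToQueue.get(user);
        // No matching rule: no placement context, as for an unmapped user.
        return mapped == null ? null : new PlacementContextSketch(mapped);
    }
}
```

With this shape, a fresh submission still goes through the rules once, while a recovered application gets a non-null placement context without touching them.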

# Could we simplify {{addApplicationOnRecovery}} in CS further? The multiple 
if checks are a bit confusing. Maybe we can create {{getQueueWithMappings}} so 
that, instead of calling getQueue directly from addApplication/OnRecovery, we 
get the queue and apply the mapping if needed. Just a bit of refactoring.
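
Roughly, the refactoring could look like the sketch below. This is only an illustration under my own assumptions: {{QueueLookupSketch}}, the two-argument signature of {{getQueueWithMappings}}, and the map-based queue bookkeeping are hypothetical and much simpler than the real CapacityScheduler. The point is that both entry points share one helper that looks up the queue and, if placement supplied a known parent, creates the missing leaf queue.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for the scheduler's queue bookkeeping. */
final class QueueLookupSketch {
    // Queue name -> parent queue name (null for top-level queues in this sketch).
    private final Map<String, String> parentByQueue = new HashMap<>();

    void addQueue(String name, String parent) {
        parentByQueue.put(name, parent);
    }

    boolean hasQueue(String name) {
        return parentByQueue.containsKey(name);
    }

    /**
     * Single lookup used by both code paths: return the queue if it exists,
     * create it under the mapped parent if that parent is known, else null.
     */
    String getQueueWithMappings(String queueName, String mappedParent) {
        if (queueName == null) {
            return null;
        }
        if (parentByQueue.containsKey(queueName)) {
            return queueName;
        }
        if (mappedParent != null && parentByQueue.containsKey(mappedParent)) {
            // Leaf queue is missing but placement named a valid parent: create it.
            parentByQueue.put(queueName, mappedParent);
            return queueName;
        }
        return null;
    }

    /** Normal submission path; returns false if the queue cannot be resolved. */
    boolean addApplication(String queueName, String mappedParent) {
        return getQueueWithMappings(queueName, mappedParent) != null;
    }

    /** Recovery path; same lookup, so no separate chain of if checks is needed. */
    boolean addApplicationOnRecovery(String queueName, String mappedParent) {
        return getQueueWithMappings(queueName, mappedParent) != null;
    }
}
```

Any recovery-specific handling (for example, what to do when the queue cannot be resolved) would stay in {{addApplicationOnRecovery}}; only the lookup and mapping are shared.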


> Handle recovery of applications on auto-created leaf queues
> -----------------------------------------------------------
>
>                 Key: YARN-7643
>                 URL: https://issues.apache.org/jira/browse/YARN-7643
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>            Reporter: Suma Shivaprasad
>            Assignee: Suma Shivaprasad
>         Attachments: YARN-7643.1.patch, YARN-7643.2.patch
>
>
> CapacityScheduler application recovery should auto-create the leaf queue if 
> it doesn't exist. Also, RMAppManager needs to set the queue-mapping placement 
> context so that the scheduler has the necessary placement context to recreate 
> the queue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
