[
https://issues.apache.org/jira/browse/YARN-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16288758#comment-16288758
]
Sunil G commented on YARN-7643:
-------------------------------
Thanks [~suma.shivaprasad]. Some comments here.
#
{code}
void replaceQueueFromPlacementContext(
    ApplicationPlacementContext placementContext,
    ApplicationSubmissionContext context) {
  // Set it to ApplicationSubmissionContext
  // apply queue mapping only to new application submissions
  if (placementContext != null && !StringUtils.equalsIgnoreCase(
      context.getQueue(), placementContext.getQueue())) {
    LOG.info("Placed application=" + context.getApplicationId() +
        " to queue=" + placementContext.getQueue() + ", original queue="
        + context.getQueue());
    context.setQueue(placementContext.getQueue());
  }
}
{code}
The queue chosen by placement is already written into the submission context
during application submission, so during recovery we already have the mapped
queue name. Hence {{UserGroupMappingPlacementRule.getPlacementForApp}} will
see the correct mapped queue name, yet we still redo the same action. The
current issue arises because the event below has to be fired from RMAppImpl to
the scheduler, and *placementContext* will be null in the recovery case (might
this also break normal user-mapping?).
{code}
app.scheduler.handle(
    new AppAddedSchedulerEvent(app.user, app.submissionContext, true,
        app.applicationPriority, app.placementContext));
{code}
A couple of suggestions:
1. Could we save *placementContext* under the app data in the state store?
2. While recomputing *placeApplication*, could we bypass some API calls from
{{PlacementManager}}, since we already have the mapped queue name?
# Could we optimize {{addApplicationOnRecovery}} in CS further? The multiple if
checks are a bit confusing. Maybe we can create {{getQueueWithMappings}} and,
instead of calling getQueue from addApplication/OnRecovery, get the queue and
apply the mapping there if needed. Just a bit of refactoring.
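To make the {{getQueueWithMappings}} idea concrete, here is a rough, self-contained sketch of the shape I have in mind. It does not use the real YARN classes: {{QueueLookupSketch}}, {{PlacementContext}}, {{addQueue}} and {{hasQueue}} are hypothetical stand-ins, and the auto-creation rule (create the leaf only when the placement context names an existing parent) is an assumption for illustration, not what the patch does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the scheduler's queue lookup; not real YARN code.
class QueueLookupSketch {

  // Minimal stand-in for ApplicationPlacementContext (assumed fields).
  static final class PlacementContext {
    final String queue;
    final String parentQueue; // null when the mapping names no parent

    PlacementContext(String queue, String parentQueue) {
      this.queue = queue;
      this.parentQueue = parentQueue;
    }
  }

  // Queue name -> parent queue name (null parent for the root).
  private final Map<String, String> queues = new HashMap<>();

  void addQueue(String name, String parent) {
    queues.put(name, parent);
  }

  boolean hasQueue(String name) {
    return queues.containsKey(name);
  }

  // Single lookup shared by submission and recovery: prefer the mapped
  // queue when a placement context is present, and auto-create the leaf
  // only if the context names an existing parent. Returns null when no
  // queue can be resolved.
  String getQueueWithMappings(String requestedQueue, PlacementContext ctx) {
    String target = (ctx != null) ? ctx.queue : requestedQueue;
    if (queues.containsKey(target)) {
      return target;
    }
    if (ctx != null && ctx.parentQueue != null
        && queues.containsKey(ctx.parentQueue)) {
      queues.put(target, ctx.parentQueue); // assumed auto-creation step
      return target;
    }
    return null;
  }
}
```

With something like this, addApplication and addApplicationOnRecovery would both call one method, and the null *placementContext* case on recovery falls back to the queue name already stored in the submission context.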
> Handle recovery of applications on auto-created leaf queues
> -----------------------------------------------------------
>
> Key: YARN-7643
> URL: https://issues.apache.org/jira/browse/YARN-7643
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacity scheduler
> Reporter: Suma Shivaprasad
> Assignee: Suma Shivaprasad
> Attachments: YARN-7643.1.patch, YARN-7643.2.patch
>
>
> CapacityScheduler application recovery should auto-create the leaf queue if it
> doesn't exist. Also, RMAppManager needs to set the queue-mapping placement
> context so that the scheduler has the necessary placement context to recreate
> the queue.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)