[
https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454314#comment-16454314
]
Wilfred Spiegelenburg commented on YARN-8191:
---------------------------------------------
Thanks [~grepas] for working on this, it is good to remove the unused queues.
As you reference in the description this looks a lot like YARN-4022.
That jira has a design doc that Yufei and I worked on a while ago. The patch
seems to follow that design. Can we copy that same design doc here and close
YARN-4022 as a duplicate or as included in this one? YARN-4022 has an old
approach based on a thread which the new design does not need.
I am not sure yet if all goals are covered as per the document.
Back to the implementation:
* The queueMgr can never be null: it is created when the FS gets created, so the
null check is not needed in the AllocationFileLoaderService.
* You are adding a new method to the FS (hasApplicationAssignedToQueue) to
check if an application is submitted to the queue but the application attempt
has not been created yet. If we have a large number of queues and applications
this count loop could become really expensive. I would suggest that we track
submitted applications in the FSLeafQueue. We can then easily redefine empty as
all 3 types (submit/run/non-run) being 0.
** In addApplication, when we update the metrics, add it to a list to track
submitted apps.
** In addApplicationAttempt, again when we update the metrics, we remove it from
the list.
* The root and root.default queues should not be marked dynamic when they get
created by the queue manager.
* I do not understand the change around isEmpty / shutDownIfEmpty: some code
is commented out and shutDownIfEmpty is never used.
* I think we should redefine isEmpty() in the QueueManager to take into account
the submitted apps instead of creating something new.
* getDynamicQueueNames should rely on the isDynamic flag that is part of the
queue rather than calculating it.
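
A minimal sketch of the tracking suggested in the points above, assuming a
simplified stand-in for the leaf queue. LeafQueueSketch, its fields and
dynamicQueueNames are hypothetical names for illustration only; they are not
the real FSLeafQueue or QueueManager API, and in the actual scheduler
addApplication and addApplicationAttempt live on the FS rather than on the
queue. The point is only that "empty" becomes a cheap check on three
collections instead of a loop over all applications, and that the dynamic
queue names come from a stored flag.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a leaf queue; NOT the real Hadoop FSLeafQueue.
class LeafQueueSketch {
  private final String name;
  // Set once at creation time; root and root.default would pass false.
  private final boolean dynamic;
  // Apps that were submitted but have no application attempt yet.
  private final Set<String> submittedApps = new HashSet<>();
  private final Set<String> runnableApps = new HashSet<>();
  private final Set<String> nonRunnableApps = new HashSet<>();

  LeafQueueSketch(String name, boolean dynamic) {
    this.name = name;
    this.dynamic = dynamic;
  }

  // Assumed to be called from the scheduler's addApplication.
  synchronized void addApplication(String appId) {
    submittedApps.add(appId);
  }

  // Assumed to be called from the scheduler's addApplicationAttempt:
  // the app leaves the submitted set and becomes runnable or non-runnable.
  synchronized void addApplicationAttempt(String appId, boolean runnable) {
    submittedApps.remove(appId);
    if (runnable) {
      runnableApps.add(appId);
    } else {
      nonRunnableApps.add(appId);
    }
  }

  synchronized void removeApplication(String appId) {
    submittedApps.remove(appId);
    runnableApps.remove(appId);
    nonRunnableApps.remove(appId);
  }

  // Empty means all three types (submit/run/non-run) are 0.
  synchronized boolean isEmpty() {
    return submittedApps.isEmpty()
        && runnableApps.isEmpty()
        && nonRunnableApps.isEmpty();
  }

  boolean isDynamic() {
    return dynamic;
  }

  String getName() {
    return name;
  }

  // Collects dynamic queue names from the stored flag, not a derived value.
  static List<String> dynamicQueueNames(Collection<LeafQueueSketch> queues) {
    List<String> names = new ArrayList<>();
    for (LeafQueueSketch q : queues) {
      if (q.isDynamic()) {
        names.add(q.getName());
      }
    }
    return names;
  }
}
```

With this shape a queue reports non-empty from the moment an application is
submitted, so it would not be removed in the window before the first attempt
is created.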
> Fair scheduler: queue deletion without RM restart
> -------------------------------------------------
>
> Key: YARN-8191
> URL: https://issues.apache.org/jira/browse/YARN-8191
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler
> Affects Versions: 3.0.1
> Reporter: Gergo Repas
> Assignee: Gergo Repas
> Priority: Major
> Attachments: YARN-8191.000.patch, YARN-8191.001.patch,
> YARN-8191.002.patch, YARN-8191.003.patch
>
>
> The Fair Scheduler never cleans up queues even if they are deleted in the
> allocation file, or were dynamically created and are never going to be used
> again. Queues always remain in memory, which leads to the following two issues:
> # Steady fairshares aren’t calculated correctly due to remaining queues.
> # WebUI shows deleted queues, which is confusing for users (YARN-4022).
> We want to support proper queue deletion without restarting the Resource
> Manager:
> # Static queues without any entries that are removed from fair-scheduler.xml
> should be deleted from memory.
> # Dynamic queues without any entries should be deleted.
> # RM Web UI should only show the queues defined in the scheduler at that
> point in time.