[ https://issues.apache.org/jira/browse/YARN-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454314#comment-16454314 ]
Wilfred Spiegelenburg commented on YARN-8191:
---------------------------------------------

Thanks [~grepas] for working on this; it is good to remove the unused queues.

As you reference in the description, this looks a lot like YARN-4022. That jira has a design doc that Yufei and I worked on a while ago, and the patch seems to follow that design. Can we copy that same design doc here and close YARN-4022 as a duplicate, or as included in this one? YARN-4022 has an old approach based on a thread, which the new design does not need. I am not sure yet whether all goals in the document are covered.

Back to the implementation:
* The queueMgr can never be null: it is created when the FS gets created, so the null check is not needed in the AllocationFileLoaderService.
* You are adding a new method to the FS (hasApplicationAssignedToQueue) to check whether an application has been submitted to the queue while its application attempt has not been created yet. If we have a large number of queues and applications, this count loop could become really expensive. I would suggest that we track submitted applications in the FSLeafQueue. We can then easily redefine empty as all 3 types (submitted/runnable/non-runnable) being 0.
** In addApplication, when we update the metrics, add the application to a list that tracks submitted apps.
** In addApplicationAttempt, again when we update the metrics, remove it from the list.
* The root and root.default queues should not be marked dynamic when they get created by the queue manager.
* I do not understand the change around isEmpty / shutDownIfEmpty: some code is commented out and shutDownIfEmpty is never used.
* I think we should redefine isEmpty() in the QueueManager to take the submitted apps into account instead of creating something new.
* getDynamicQueueNames should rely on the isDynamic flag that is part of the queue, rather than calculating it.
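
To illustrate the suggestion of tracking submitted applications in the leaf queue, here is a minimal sketch. It is not the real FSLeafQueue: the class name LeafQueueSketch and the methods appSubmitted, attemptAdded and attemptRemoved are hypothetical stand-ins for the points in addApplication and addApplicationAttempt where the metrics are updated, and the real patch may look quite different.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for FSLeafQueue; names are assumptions, not Hadoop APIs.
class LeafQueueSketch {
    // Apps submitted to this queue whose first attempt has not been created yet.
    private final Set<String> submittedApps = new HashSet<>();
    private int runnableApps = 0;
    private int nonRunnableApps = 0;

    // Would be called from addApplication, where the metrics are updated.
    synchronized void appSubmitted(String appId) {
        submittedApps.add(appId);
    }

    // Would be called from addApplicationAttempt: the app is no longer
    // "submitted only", it is now either runnable or non-runnable.
    synchronized void attemptAdded(String appId, boolean runnable) {
        submittedApps.remove(appId);
        if (runnable) {
            runnableApps++;
        } else {
            nonRunnableApps++;
        }
    }

    // Would be called when an attempt leaves the queue.
    synchronized void attemptRemoved(boolean runnable) {
        if (runnable && runnableApps > 0) {
            runnableApps--;
        } else if (!runnable && nonRunnableApps > 0) {
            nonRunnableApps--;
        }
    }

    // Empty means all three kinds (submitted/runnable/non-runnable) are 0.
    // This is a constant-time check, with no loop over all applications.
    synchronized boolean isEmpty() {
        return submittedApps.isEmpty()
            && runnableApps == 0
            && nonRunnableApps == 0;
    }
}
```

With this kind of bookkeeping the emptiness check is local to the queue, so the cost no longer grows with the total number of queues and applications.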

> Fair scheduler: queue deletion without RM restart
> -------------------------------------------------
>
>                 Key: YARN-8191
>                 URL: https://issues.apache.org/jira/browse/YARN-8191
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>   Affects Versions: 3.0.1
>            Reporter: Gergo Repas
>            Assignee: Gergo Repas
>            Priority: Major
>         Attachments: YARN-8191.000.patch, YARN-8191.001.patch,
> YARN-8191.002.patch, YARN-8191.003.patch
>
>
> The Fair Scheduler never cleans up queues even if they are deleted in the
> allocation file, or were dynamically created and are never going to be used
> again. Queues always remain in memory which leads to two following issues.
> # Steady fairshares aren’t calculated correctly due to remaining queues
> # WebUI shows deleted queues, which is confusing for users (YARN-4022).
> We want to support proper queue deletion without restarting the Resource
> Manager:
> # Static queues without any entries that are removed from fair-scheduler.xml
> should be deleted from memory.
> # Dynamic queues without any entries should be deleted.
> # RM Web UI should only show the queues defined in the scheduler at that
> point in time.