[
https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272968#comment-17272968
]
Andras Gyori commented on YARN-10532:
-------------------------------------
Thank you [~zhuqi] for the patch. I have several points regarding
this approach:
* In my opinion, implementing auto queue deletion for the legacy auto queue
logic is not justified. Old CS users have their own way of keeping their queue
hierarchy clean, thus providing this feature would be of little use for them.
As for new CS users, they are encouraged to use the new auto queue creation. We
should encourage the userbase to move away from ManagedParents, as the code is
hard to maintain and very hard to reason about.
* I think the approach chosen for this patch is hard to maintain because:
** It does not have a central point where the dynamic queue deletion happens
(this was a major pain point of weight calculation as well, and we should not
repeat this mistake). QueueManagementChanges and updateQueues both have
convoluted logic, which reduces readability.
** It does not cover all cases. If I understand correctly, the auto deletion
is only triggered if CS is reinitialised or a queue management change occurs. In
my opinion, we should not rely on user-triggered events, which may or may not
happen.
** It does not handle deletion of ParentQueues. I think childless ParentQueues
should get removed as well.
My idea is to implement automatic queue deletion somewhat like a garbage
collector:
# Run a background thread that periodically checks the whole queue hierarchy
(maybe we could store references to all the dynamic queues, in order to
eliminate the cost of traversing the hierarchy)
# Store the timestamp when a dynamic queue reaches 0 applications (either in
the queue itself or in an external map)
# Mark for deletion the queues that have been without applications for a
configured time
## Marking introduces a grace period, to avoid race conditions (namely, deleting
a queue at the same time as an application is submitted to it)
## Application submission to marked queues should either be rejected or make
the mapping rules step to the next rule
# After the grace period, check that the marked queues do not have any
applications running, and:
## Delete, if active application number is still == 0
## Remove mark and timestamp if active application number > 0
# Remove dynamic ParentQueues the same way, but instead of checking active
applications, check the number of children
Now, I see that marking would introduce a surprising behaviour, but I cannot
come up with a way that is less disruptive and solves the race condition at the
same time.
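To make the mark-and-sweep idea above more concrete, here is a minimal sketch of
the leaf queue part (steps 2 to 4), using the "external map" variant. Everything
in it is hypothetical: DynamicQueueSweeper, its methods and the two timeouts are
illustrative names, not existing CapacityScheduler classes or configuration
keys. Removal of childless ParentQueues (step 5) and the wiring into the real
queue hierarchy are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; none of these names exist in CapacityScheduler. */
class DynamicQueueSweeper {
  /** Bookkeeping for one dynamic leaf queue, kept in an external map. */
  static final class State {
    int activeApps;
    long idleSince = -1; // time the queue reached 0 applications, -1 if busy
    long markedAt = -1;  // time the queue was marked for deletion, -1 if not
  }

  private final Map<String, State> queues = new HashMap<>();
  private final long expiryMs; // idle time before a queue gets marked
  private final long graceMs;  // time between marking and actual deletion

  DynamicQueueSweeper(long expiryMs, long graceMs) {
    this.expiryMs = expiryMs;
    this.graceMs = graceMs;
  }

  /** Registers a newly created dynamic queue; it starts out idle. */
  synchronized void addQueue(String path, long now) {
    State s = new State();
    s.idleSince = now;
    queues.put(path, s);
  }

  /** Returns false if the queue is unknown or marked; the caller may then
   *  reject the application or let the mapping rules try the next rule. */
  synchronized boolean submitApp(String path) {
    State s = queues.get(path);
    if (s == null || s.markedAt >= 0) {
      return false;
    }
    s.activeApps++;
    s.idleSince = -1;
    return true;
  }

  /** Records the idle timestamp when the last application finishes. */
  synchronized void finishApp(String path, long now) {
    State s = queues.get(path);
    if (s != null && s.activeApps > 0 && --s.activeApps == 0) {
      s.idleSince = now;
    }
  }

  /** One pass of the background check; returns the queues it deleted. */
  synchronized List<String> sweep(long now) {
    List<String> deleted = new ArrayList<>();
    Iterator<Map.Entry<String, State>> it = queues.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, State> e = it.next();
      State s = e.getValue();
      if (s.markedAt >= 0) {
        if (now - s.markedAt < graceMs) {
          continue; // still inside the grace period
        }
        if (s.activeApps == 0) {
          deleted.add(e.getKey()); // still empty: delete
          it.remove();
        } else {
          // An application got in despite the mark: remove mark and timestamp.
          // Unreachable in this synchronized sketch, kept to mirror the steps.
          s.markedAt = -1;
          s.idleSince = -1;
        }
      } else if (s.activeApps == 0 && s.idleSince >= 0
          && now - s.idleSince >= expiryMs) {
        s.markedAt = now; // idle for long enough: mark, do not delete yet
      }
    }
    return deleted;
  }
}
```

The clock is passed in as a parameter only to keep the sketch deterministic. In
a running scheduler, something like
ScheduledExecutorService.scheduleWithFixedDelay could call
sweep(System.currentTimeMillis()) periodically to play the role of the
background thread from step 1.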
> Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is
> not being used
> --------------------------------------------------------------------------------------------
>
> Key: YARN-10532
> URL: https://issues.apache.org/jira/browse/YARN-10532
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Assignee: zhuqi
> Priority: Major
> Attachments: YARN-10532.001.patch, YARN-10532.002.patch,
> YARN-10532.003.patch, YARN-10532.004.patch, YARN-10532.005.patch,
> YARN-10532.006.patch, YARN-10532.007.patch
>
>
> It's better if we can delete auto-created queues when they are not in use for
> a period of time (like 5 mins). It will be helpful when we have a large
> number of auto-created queues (e.g. from 500 users), but only a small subset
> of queues are actively used.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)