[
https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997290#comment-16997290
]
Wilfred Spiegelenburg commented on YARN-9879:
---------------------------------------------
I have read through the design document and was wondering if we cannot take a
far simpler approach.
We could simply relax the rule that a leaf queue name must be unique in the
system, and instead require only that a queue is unique by its full queue path.
This does not break existing configurations, as a unique leaf queue name is
also unique when you take the whole path into account. That means nothing needs
to change for current clusters. Internally the scheduler does have to change to
make sure that all references use the queue path. This will require a lot of
changes throughout the scheduler: everywhere we look up a queue, and everywhere
we store a reference that is not a direct reference to the leaf queue.
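To make the idea concrete, here is a minimal sketch of a lookup table keyed by
full path. It is Python for brevity and is not the Capacity Scheduler code; the
class QueueIndex and its methods are hypothetical names, and the "." path
separator and "root" prefix are assumed from the usual queue naming.

```python
from collections import defaultdict


class QueueIndex:
    """Hypothetical lookup table: leaf queues are keyed by their full path."""

    def __init__(self):
        self._by_path = {}                      # full path -> queue object
        self._paths_by_leaf = defaultdict(set)  # short name -> full paths

    def add(self, path, queue):
        # Uniqueness is enforced on the full path only, not on the short name.
        if path in self._by_path:
            raise ValueError(f"duplicate queue path: {path}")
        self._by_path[path] = queue
        short_name = path.rsplit(".", 1)[-1]
        self._paths_by_leaf[short_name].add(path)

    def get(self, path):
        # Returns None when no queue is registered under that path.
        return self._by_path.get(path)

    def paths_for_leaf(self, short_name):
        # All full paths that end in the given short name, sorted.
        return sorted(self._paths_by_leaf.get(short_name, ()))
```

With this shape, an existing configuration maps every short name to exactly one
path, so nothing changes for it; a name only becomes ambiguous once a second
path with the same last component is added.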
The only other part we need to handle correctly is the submit side, which must
stay backward compatible. There are two cases to handle: a submission with just
a queue name and a submission with a queue path. I'll discuss updating the
configuration later.
# When an application is submitted with just a queue name (not a path) we
expect that the name is a unique leaf queue name. If that queue does not exist
or is not uniquely identifiable we reject the application submission.
Resolution of the real leaf queue follows the same steps as it does now. The
queue name is in the end converted to the correct leaf queue, identified by its
path. For existing configurations nothing changes. Internally we hide all the
changes.
# When the submission has a queue path (fully qualified or not) we check that
the queue exists based on that path. If no leaf queue is defined at that path
the application submission is rejected.
If the scheduler has leaf queues with a non-unique name, submitting to those
queues can only be done by using their paths. There is nothing that needs to be
configured to switch this behaviour on or off.
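The two submission cases can be sketched as follows. This is an illustration in
Python, not the scheduler's actual code: resolve_queue and SubmissionRejected
are hypothetical names, and resolving a path that is not fully qualified by
prefixing "root." is an assumption of the sketch.

```python
class SubmissionRejected(Exception):
    """Hypothetical error raised when a submission cannot be placed."""


def resolve_queue(requested, leaf_paths):
    """Map the queue given at submit time to exactly one full leaf queue path.

    leaf_paths: iterable of full leaf queue paths, e.g. "root.a.default".
    """
    leaf_paths = set(leaf_paths)
    if "." not in requested:
        # Case 1: bare name. It must match exactly one leaf queue.
        matches = [p for p in leaf_paths if p.rsplit(".", 1)[-1] == requested]
        if len(matches) != 1:
            raise SubmissionRejected(
                f"queue name '{requested}' matches {len(matches)} leaf queues")
        return matches[0]
    # Case 2: a path. Assumption: a path without the "root." prefix is
    # resolved relative to root.
    full = requested if requested.startswith("root.") else "root." + requested
    if full not in leaf_paths:
        raise SubmissionRejected(f"no leaf queue at path '{full}'")
    return full
```

For example, with leaf queues root.a.default, root.b.default and root.b.batch,
the name "batch" still resolves as it does today, while "default" is rejected
and has to be given as "root.a.default" or "root.b.default".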
The important part is applying a new configuration. If the configuration adds a
leaf queue that is not unique, the configuration update is currently rejected.
With this change we would allow that config to become active. This *could*
break existing applications when they try to submit by name to a leaf queue
that is no longer unique.
We should at least log and warn clearly in the response of the update. Maybe
even show it in the UI, or we could ask for a confirmation. The first update
that adds a non-unique queue to the configuration should always fail,
complaining loudly. It should then keep warning the user and rejecting the
update unless a confirmation flag is set to force the update through. Once that
first update has been applied, the flag would not be needed anymore.
Reading a config from a file or store which is used to initialise the scheduler
should not trigger such behaviour. We still should show a warning in the logs
to make sure it is not lost.
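A rough sketch of that update check, again in Python and only as an
illustration: config_update_allowed, duplicate_leaf_names and the force and
is_initial_load parameters are hypothetical names, not existing YARN options.

```python
import logging
from collections import Counter

log = logging.getLogger("queue-config")


def duplicate_leaf_names(leaf_paths):
    """Return the sorted short names shared by more than one leaf queue path."""
    counts = Counter(p.rsplit(".", 1)[-1] for p in leaf_paths)
    return sorted(name for name, n in counts.items() if n > 1)


def config_update_allowed(old_leaf_paths, new_leaf_paths,
                          is_initial_load=False, force=False):
    """Return True if the new configuration may become active."""
    new_dups = duplicate_leaf_names(new_leaf_paths)
    if not new_dups:
        return True
    # Always warn, so the condition is never silently lost.
    log.warning("non-unique leaf queue names: %s", ", ".join(new_dups))
    if is_initial_load:
        # Reading from a file or store at startup only warns.
        return True
    # Only the first update that introduces duplicates needs the flag.
    already_had_dups = bool(duplicate_leaf_names(old_leaf_paths))
    return already_had_dups or force
```

The sketch treats "first update" as "the active configuration has no duplicate
names yet"; a real implementation might track the confirmation differently.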
What do you think about this approach?
> Allow multiple leaf queues with the same name in CS
> ---------------------------------------------------
>
> Key: YARN-9879
> URL: https://issues.apache.org/jira/browse/YARN-9879
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Gergely Pollak
> Assignee: Gergely Pollak
> Priority: Major
> Attachments: DesignDoc_v1.pdf
>
>
> Currently the leaf queue's name must be unique regardless of its position in
> the queue hierarchy.
> A design doc and first proposal are being prepared; I'll attach them as soon
> as they are done.