[
https://issues.apache.org/jira/browse/YARN-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802176#comment-15802176
]
Xuan Gong commented on YARN-5556:
---------------------------------
[~Naganarasimha] Thanks for the comments.
[~leftnoteasy] Please comment if you have any further suggestions.
bq. So if a user needs to delete a queue (say a2), then he needs to remove the
queue from its parent's "yarn.scheduler.capacity.<parent queue>.queues" config
and also set its state (yarn.scheduler.capacity.<root...a2>.state) to DELETED,
right?
You do not need to remove the queue from its parent's
"yarn.scheduler.capacity.<parent queue>.queues" config; just set its
state (yarn.scheduler.capacity.<root...a2>.state) to DELETED.
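To make the proposal concrete, here is a rough sketch of the single property that would be added under this scheme. It only builds the XML fragment; the queue path "root.a.a2" is a made-up example (the full path of a2 is not given above), and DELETED is the state value proposed in this thread, not necessarily one that an existing release accepts.

```python
import xml.etree.ElementTree as ET

PREFIX = "yarn.scheduler.capacity"

def state_property(queue_path: str, state: str = "DELETED") -> ET.Element:
    """Build a <property> element that sets the state of one queue.

    queue_path is the full dotted path of the queue, e.g. the hypothetical
    "root.a.a2". DELETED is the value proposed in this thread (an assumption).
    """
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = f"{PREFIX}.{queue_path}.state"
    ET.SubElement(prop, "value").text = state
    return prop

# The parent's ".queues" list is left untouched; only the state is added.
rendered = ET.tostring(state_property("root.a.a2"), encoding="unicode")
print(rendered)
```

The fragment would go into the same capacity scheduler configuration that already lists the queue, followed by the usual CS refresh.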
bq. How to delete intermediate queues? I presume we need NOT configure state
for each of its children, right? Or do we plan to support deleting only leaf
queues?
We need not configure the state for each of its children. Just mark the
queue itself as DELETED.
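One way to read this is that marking an intermediate queue implies deletion of its whole subtree. The sketch below illustrates that reading only; the function name and the queue paths are hypothetical, and how CapacityScheduler would actually resolve this is not settled here.

```python
def effectively_deleted(states: dict[str, str]) -> set[str]:
    """Return every configured queue that is DELETED itself or sits
    under a queue marked DELETED.

    states maps a full dotted queue path to its configured state.
    """
    marked = {q for q, s in states.items() if s == "DELETED"}
    return {
        q for q in states
        # Match the queue itself or a true descendant ("root.a." prefix),
        # so that "root.ab" is not mistaken for a child of "root.a".
        if any(q == m or q.startswith(m + ".") for m in marked)
    }

# Hypothetical hierarchy: only the intermediate queue root.a is marked.
states = {
    "root": "RUNNING",
    "root.a": "DELETED",
    "root.a.a1": "RUNNING",
    "root.a.a2": "RUNNING",
    "root.ab": "RUNNING",
    "root.b": "RUNNING",
}
print(sorted(effectively_deleted(states)))
```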
bq. Do we need to consider moving queues (along with their apps) from one
queue hierarchy to another? IMO it complicates things, but I am not sure about
the real-world use cases.
We can consider this scenario later.
bq. In case of HA, I think it gets more complicated: if both RMs are
initialized with the old queue settings and the queue config is then updated,
CS is aware of the deleted queue. But if an RM starts off with the updated xml
(with the deleted queue), the deleted queue information is not available, and
if failover happens to this RM, apps running on the deleted queue cannot be
recovered because the queue doesn't exist. So do we need to start maintaining
the deleted queue in the state store, or to handle creating queue objects for
queues whose state has been marked as deleted (then we need to consider the
2nd point)?
Yes, this is the fundamental issue with the "configuration-based" approach.
The API-based approach would solve this issue:
https://issues.apache.org/jira/browse/YARN-5734. But for the
"configuration-based" approach, in the RM HA case, we have to make sure the
configuration file on every RM node is updated.
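As an illustration of the HA requirement, an operator-side check could compare the queue state seen by each RM before relying on failover. This is only a sketch under assumptions: the RM ids, the queue path, and the idea of having each RM's properties loaded into a dict are all hypothetical, and nothing like this is part of the patch.

```python
def rms_out_of_sync(rm_configs: dict[str, dict[str, str]], queue_path: str) -> list[str]:
    """Return the ids of RMs whose config does not mark queue_path as DELETED.

    rm_configs maps an RM id to that RM's capacity scheduler properties.
    """
    key = f"yarn.scheduler.capacity.{queue_path}.state"
    return sorted(rm for rm, conf in rm_configs.items() if conf.get(key) != "DELETED")

# Hypothetical two-RM setup where rm2 still has the old file.
configs = {
    "rm1": {"yarn.scheduler.capacity.root.a.a2.state": "DELETED"},
    "rm2": {},
}
print(rms_out_of_sync(configs, "root.a.a2"))
```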
bq. Do we need to consider showing the deleted queues in the web UI? Maybe
in another jira, but the code needs to be updated.
Yes, we could file a separate jira, and do it later.
The basic workflow could be: before we can actually delete the queue, we should
make sure the queue is in the STOPPED state, which means it cannot accept any
new applications, and that all its apps (including pending requests) have
finished (for now, we could simply wait, or add a command/flag to force-kill
them later). Then we could delete the queue and split capacity.
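The precondition in this workflow can be sketched roughly as follows. The QueueSnapshot type and its field names are hypothetical and only model the check described above (STOPPED, no running apps, nothing pending); they are not the scheduler's actual classes.

```python
from dataclasses import dataclass

@dataclass
class QueueSnapshot:
    """Hypothetical point-in-time view of one queue."""
    state: str          # e.g. "RUNNING" or "STOPPED"
    running_apps: int   # apps currently running in the queue
    pending_apps: int   # apps accepted but not yet started

def can_delete(queue: QueueSnapshot) -> bool:
    """A queue may be removed only once it is STOPPED and fully drained."""
    return (
        queue.state == "STOPPED"
        and queue.running_apps == 0
        and queue.pending_apps == 0
    )

print(can_delete(QueueSnapshot("STOPPED", running_apps=0, pending_apps=0)))
print(can_delete(QueueSnapshot("STOPPED", running_apps=2, pending_apps=0)))
```

With the "simply wait" option, the refresh would just be retried until this check passes; a force-kill flag would instead drain the queue first.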
Thanks
Xuan Gong
> Support for deleting queues without requiring a RM restart
> ----------------------------------------------------------
>
> Key: YARN-5556
> URL: https://issues.apache.org/jira/browse/YARN-5556
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: Xuan Gong
> Assignee: Naganarasimha G R
> Attachments: YARN-5556.v1.001.patch, YARN-5556.v1.002.patch,
> YARN-5556.v1.003.patch, YARN-5556.v1.004.patch
>
>
> Today, we can add or modify queues without restarting the RM, via a CS
> refresh. But to delete a queue, we have to restart the ResourceManager. We
> could support deleting queues without requiring an RM restart.