[
https://issues.apache.org/jira/browse/YARN-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334116#comment-16334116
]
Szilard Nemeth edited comment on YARN-4022 at 1/22/18 11:27 AM:
----------------------------------------------------------------
Hey @danieltempleton, @yufei!
Could you please help me a bit with this one?
I left the _{{yarn.scheduler.fair.user-as-default-queue}}_ and
_{{yarn.scheduler.fair.allow-undeclared-pools}}_ configs at their default
values (true) because, according to the FairScheduler page
([https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html]),
this is the combination that lets queues be created dynamically.
When I started a pi job, the job was assigned to a dynamically created queue,
namely "root.szilardnemeth" so I guess the above config is correct.
When I ran {{yarn rmadmin -refreshQueues}} and checked the RM Webservices API
with a GET request to "ws/v1/cluster/scheduler", the queue was still there.
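To make the check above repeatable, here is a minimal sketch of the same GET request in Java. The RM web address ({{http://localhost:8088}}) is an assumption for a local setup, the class and method names are made up for illustration, and the queue lookup is a crude substring match on the raw response rather than real JSON parsing.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of Hadoop.
class SchedulerQueueCheck {

  // Builds the scheduler endpoint URI from the RM web address.
  static URI schedulerUri(String rmWebAddress) {
    String base = rmWebAddress.endsWith("/")
        ? rmWebAddress.substring(0, rmWebAddress.length() - 1)
        : rmWebAddress;
    return URI.create(base + "/ws/v1/cluster/scheduler");
  }

  // Crude check: is the quoted queue name present anywhere in the response body?
  static boolean mentionsQueue(String json, String queueName) {
    return json.contains("\"" + queueName + "\"");
  }

  public static void main(String[] args) throws Exception {
    // Assumed defaults; pass the RM address and queue name as arguments to override.
    String rm = args.length > 0 ? args[0] : "http://localhost:8088";
    String queue = args.length > 1 ? args[1] : "root.szilardnemeth";

    HttpRequest request = HttpRequest.newBuilder(schedulerUri(rm))
        .header("Accept", "application/json")
        .GET()
        .build();
    HttpResponse<String> response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString());

    System.out.println(queue + " present: " + mentionsQueue(response.body(), queue));
  }
}
```

Running it once before and once after {{yarn rmadmin -refreshQueues}} shows whether the queue survived the refresh.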
After that, I debugged the calls described below and found that queues are
not deleted when refreshQueues is invoked, even if they are empty.
Eventually,
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager#removeEmptyIncompatibleQueues}}
is invoked, but this method does not delete my dynamically created leaf queue.
It also does not seem to be a good place to add queue removal functionality,
since it only deals with incompatible queues.
When I run {{yarn rmadmin -refreshQueues}}, the following relevant calls are
performed:
1. {{AdminService.refreshQueues}} handles the CLI command
2.
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler#reinitialize}}
is invoked
3. The method above invokes
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService#reloadAllocations}}
which loads the allocations.xml file.
At the end of this method, a call happens to the {{reloadListener}} with the
parsed configuration object: {{reloadListener.onReload(info);}}
4.
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.AllocationReloadListener#onReload}}
is invoked.
5. The method above invokes
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager#updateAllocationConfiguration}}.
This method is responsible for removing incompatible queues; see
{{removeEmptyIncompatibleQueues}} in {{QueueManager}} (at the time of writing:
[https://github.com/apache/hadoop/blob/99292adcefdc6b8f280b8e100605fb39f755c38a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java#L351])
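The call chain above can be traced with a toy model. This is only a sketch: the stub classes below are simplified stand-ins that borrow the method names from the list, not the real Hadoop classes, and the allocation info is a placeholder string.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the refreshQueues call chain; all classes here are hypothetical stubs.
class RefreshQueuesTrace {
  // Records the order in which the stand-in methods are entered.
  static final List<String> calls = new ArrayList<>();

  interface Listener {
    void onReload(String allocationInfo);
  }

  static class QueueManagerStub {
    void updateAllocationConfiguration(String info) {
      calls.add("QueueManager#updateAllocationConfiguration");
      removeEmptyIncompatibleQueues();
    }

    void removeEmptyIncompatibleQueues() {
      // Only incompatible queues are handled here; dynamic queues are untouched.
      calls.add("QueueManager#removeEmptyIncompatibleQueues");
    }
  }

  static class LoaderStub {
    Listener reloadListener;

    void reloadAllocations() {
      calls.add("AllocationFileLoaderService#reloadAllocations");
      String info = "parsed allocation file"; // placeholder for the parsed config
      reloadListener.onReload(info);
    }
  }

  static class SchedulerStub {
    final QueueManagerStub queueManager = new QueueManagerStub();
    final LoaderStub loader = new LoaderStub();

    SchedulerStub() {
      loader.reloadListener = info -> {
        calls.add("AllocationReloadListener#onReload");
        queueManager.updateAllocationConfiguration(info);
      };
    }

    void reinitialize() {
      calls.add("FairScheduler#reinitialize");
      loader.reloadAllocations();
    }
  }

  // Entry point standing in for the admin command handler.
  static List<String> refreshQueues() {
    calls.clear();
    calls.add("AdminService.refreshQueues");
    new SchedulerStub().reinitialize();
    return new ArrayList<>(calls);
  }

  public static void main(String[] args) {
    refreshQueues().forEach(System.out::println);
  }
}
```

In this model, everything after {{reinitialize}} is driven by the parsed allocation file, so a queue that was never declared in that file is never considered for removal.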
*For me, adding the queue removal functionality to FairScheduler.reinitialize
would be the most logical thing to do: the rest of the methods are strongly
related to reading the allocations file, and since dynamically created queues
are not based on that file, they are a "separate entity".*
My questions:
1. Should all empty dynamically created queues be removed when the
refreshQueues command is invoked from the CLI?
2. May all empty queues be removed when the refreshQueues command is invoked,
or just the dynamically created ones?
3. If the answer is "just the dynamically created queues can be removed" for
question 2, how can I differentiate the normal queues from the dynamically
created queues?
4. Does it make sense to define a new command like "purgeQueues" or something
like that?
Based on the documentation
(https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnCommands.html),
refreshQueues does not delete any queue, so users might be surprised if it
started removing queues.
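To make questions 1 to 3 concrete, here is one possible approach as a toy sketch: record at creation time whether a queue was created dynamically, and remove only queues that are both dynamic and empty. All names below (the {{dynamic}} flag, {{removeEmptyDynamicQueues}}, the example queue names) are hypothetical and do not refer to existing FairScheduler code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy queue registry; a sketch of the idea, not the FairScheduler implementation.
class DynamicQueuePurge {

  static final class Queue {
    final String name;
    final boolean dynamic; // hypothetical flag: true if not declared in the allocation file
    int runningApps;

    Queue(String name, boolean dynamic) {
      this.name = name;
      this.dynamic = dynamic;
    }
  }

  private final Map<String, Queue> queues = new LinkedHashMap<>();

  // Registers a queue that comes from the allocation file.
  void declare(String name) {
    queues.put(name, new Queue(name, false));
  }

  // Returns the existing queue, or creates it on demand and marks it as dynamic.
  Queue getOrCreateDynamic(String name) {
    return queues.computeIfAbsent(name, n -> new Queue(n, true));
  }

  boolean contains(String name) {
    return queues.containsKey(name);
  }

  // Removes queues that are both dynamic and empty; returns the removed names.
  List<String> removeEmptyDynamicQueues() {
    List<String> removed = new ArrayList<>();
    Iterator<Queue> it = queues.values().iterator();
    while (it.hasNext()) {
      Queue q = it.next();
      if (q.dynamic && q.runningApps == 0) {
        it.remove();
        removed.add(q.name);
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    DynamicQueuePurge manager = new DynamicQueuePurge();
    manager.declare("root.default");
    manager.getOrCreateDynamic("root.szilardnemeth").runningApps = 1; // still busy
    manager.getOrCreateDynamic("root.idleuser");                      // hypothetical idle user queue
    System.out.println("removed: " + manager.removeEmptyDynamicQueues());
  }
}
```

Whether such a removal step runs as part of refreshQueues or behind a separate command is exactly what question 4 asks; the sketch works the same way in both cases.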
> queue not remove from webpage(/cluster/scheduler) when delete queue in
> xxx-scheduler.xml
> ----------------------------------------------------------------------------------------
>
> Key: YARN-4022
> URL: https://issues.apache.org/jira/browse/YARN-4022
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.7.1
> Reporter: forrestchen
> Assignee: Szilard Nemeth
> Priority: Major
> Labels: oct16-medium, scheduler
> Attachments: YARN-4022.001.patch, YARN-4022.002.patch,
> YARN-4022.003.patch, YARN-4022.004.patch
>
>
> When I delete an existing queue by modifying xxx-scheduler.xml, I can still
> see the queue information block on the webpage (/cluster/scheduler), though
> the 'Min Resources' items all become zero and there is no 'Max Running
> Applications' item.
> I can still submit an application to the deleted queue, and the application
> will run in the 'root.default' queue instead, but submitting to a
> non-existent queue causes an exception.
> My expectation is that the deleted queue will not be displayed on the webpage
> and that submitting an application to the deleted queue will behave as if the
> queue did not exist.
> PS: There is no application running in the queue I deleted.
> Some related config in yarn-site.xml:
> {code}
> <property>
>   <name>yarn.scheduler.fair.user-as-default-queue</name>
>   <value>false</value>
> </property>
> <property>
>   <name>yarn.scheduler.fair.allow-undeclared-pools</name>
>   <value>false</value>
> </property>
> {code}
> a related question is here:
> http://stackoverflow.com/questions/26488564/hadoop-yarn-why-the-queue-cannot-be-deleted-after-i-revise-my-fair-scheduler-xm