[
https://issues.apache.org/jira/browse/YARN-7107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151833#comment-16151833
]
Daniel Templeton commented on YARN-7107:
----------------------------------------
Thanks for the patch. I love that you started with unit tests. We got off on
a tangent about override queues (my fault) and never got around to discussing
how to implement disabling a queue. To my understanding, disabling a queue
should still allow applications to be submitted to the queue, but it should
prevent them from running, even if the cluster is otherwise idle.
The patch you've posted almost does that, except that the jobs in disabled
queues will run if there's nothing else in the cluster that wants resources.
Instead of turning down the min share and weight (which will also cause
reporting to look odd), I'd add a new property to {{FSQueue}} called
{{disabled}}. In {{FSQueue.assignContainerPreCheck()}}, I'd immediately return
false if {{disabled}} is {{true}}. In {{FSQueue.getQueueInfo()}}, I'd also add
{{disabled}} to the reported state. As a performance optimization, I'd also
modify {{FSParentQueue}} to keep disabled queues in a list separate from
{{childQueues}}, and then in {{FSParentQueue.getQueueInfo()}}, I'd iterate
through both {{childQueues}} and the list of disabled queues. I think that's
all that would be needed in order to completely shut down execution from the
disabled queues. (Actually, the {{disabled}} property is probably overkill,
but it's nice to have it for reporting.)
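To make the idea concrete, here is a rough, self-contained sketch of the
shape I have in mind. It does not use the real Hadoop classes: {{SketchQueue}},
{{SketchLeafQueue}}, {{SketchParentQueue}}, {{setChildDisabled()}} and
{{getChildQueueInfos()}} are hypothetical stand-ins, and the real
{{assignContainerPreCheck()}} and {{getQueueInfo()}} have different signatures.
Treat it as an illustration of the pre-check and the separate list, not as a
patch.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for FSQueue (hypothetical, not the Hadoop class).
abstract class SketchQueue {
    private final String name;
    private boolean disabled; // the proposed new property

    SketchQueue(String name) { this.name = name; }

    boolean isDisabled() { return disabled; }
    void setDisabled(boolean disabled) { this.disabled = disabled; }

    // Plays the role of assignContainerPreCheck(): a disabled queue is
    // rejected immediately, before any other scheduling logic runs.
    boolean assignContainerPreCheck() { return !disabled; }

    // Plays the role of getQueueInfo(): the disabled state is reported.
    String getQueueInfo() {
        return name + " [" + (disabled ? "DISABLED" : "RUNNING") + "]";
    }

    abstract boolean assignContainer();
}

// Simplified stand-in for a leaf queue holding applications.
class SketchLeafQueue extends SketchQueue {
    private int pendingApps;
    private int runningApps;

    SketchLeafQueue(String name) { super(name); }

    // Submission is always accepted, even when the queue is disabled.
    void submitApp() { pendingApps++; }
    int getPendingApps() { return pendingApps; }
    int getRunningApps() { return runningApps; }

    @Override
    boolean assignContainer() {
        if (!assignContainerPreCheck() || pendingApps == 0) {
            return false;
        }
        pendingApps--;
        runningApps++;
        return true;
    }
}

// Simplified stand-in for FSParentQueue.
class SketchParentQueue extends SketchQueue {
    private final List<SketchQueue> childQueues = new ArrayList<>();
    // Disabled children live here, so scheduling never visits them.
    private final List<SketchQueue> disabledQueues = new ArrayList<>();

    SketchParentQueue(String name) { super(name); }

    void addChild(SketchQueue child) {
        (child.isDisabled() ? disabledQueues : childQueues).add(child);
    }

    // Moves a child between the two lists and flips its flag.
    void setChildDisabled(SketchQueue child, boolean disabled) {
        List<SketchQueue> from = disabled ? childQueues : disabledQueues;
        List<SketchQueue> to = disabled ? disabledQueues : childQueues;
        if (from.remove(child)) {
            child.setDisabled(disabled);
            to.add(child);
        }
    }

    @Override
    boolean assignContainer() {
        if (!assignContainerPreCheck()) {
            return false;
        }
        for (SketchQueue child : childQueues) { // enabled children only
            if (child.assignContainer()) {
                return true;
            }
        }
        return false;
    }

    // Reporting walks both lists so disabled queues stay visible.
    List<String> getChildQueueInfos() {
        List<String> infos = new ArrayList<>();
        for (SketchQueue child : childQueues) {
            infos.add(child.getQueueInfo());
        }
        for (SketchQueue child : disabledQueues) {
            infos.add(child.getQueueInfo());
        }
        return infos;
    }
}
```

In this sketch an application submitted to a disabled leaf stays pending even
when nothing else wants resources, and it becomes schedulable again once the
parent moves the leaf back into {{childQueues}}.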
What do you think?
> add ability in Fair Scheduler to configure whether disable a queue
> ------------------------------------------------------------------
>
> Key: YARN-7107
> URL: https://issues.apache.org/jira/browse/YARN-7107
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: YunFan Zhou
> Assignee: YunFan Zhou
> Labels: fairscheduler
> Attachments: YARN-7107.001.preview.patch
>
>
> In a production environment, emergency situations arise (such as the need to
> compute important reports as soon as possible) in which we need to disable
> all other queues, so that the *RM*'s resources are assigned only to the
> emergency queue. The other queues should be scheduled normally again only
> after the urgent tasks have finished.
> At present, our approach is to write a script that, in an emergency,
> manually changes all other queues' *minResources* and *maxResources* to
> *0mb, 0vcores* and then rebase it. This is very troublesome and error-prone.
> So we need to add a setting to the *FairScheduler* configuration to
> indicate whether a queue is disabled. If it is disabled, the *RM* will
> not allocate resources to the queue.
> * The child queue will inherit this property from the parent queue.
> * If the child queue is configured with this property, the value of the child
> queue configuration overrides the attributes of the parent queue.
> * The default value of the root queue is *enabled*.
> This will satisfy our needs, and I think other users will encounter such a
> scenario. I think this is applicable to everyone.