[
https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147663#comment-16147663
]
Daniel Templeton commented on YARN-2497:
----------------------------------------
bq. We should guarantee that a queue with non-label cannot access a node with a
label
Agreed, and the current patch does that. What I still have to figure out is
how to sensibly give a queue access to both labeled and unlabeled nodes. In the
capacity scheduler, all queues can access nodes with no label. I'm not sure
that's the best approach.
For example, assume I have a GPU label, and I want to make sure that any app
requesting nodes with the GPU label is scheduled as a priority (because my GPU
card is an expensive resource that I want to see maximally used). I therefore
create a GPU queue and give that queue a very high weight. If that queue also
allowed apps with no label, then I could submit non-GPU jobs to that queue just
to boost my priority. On the other hand, I want to be able to submit an app
that uses no label for the AM so that I don't consume GPU resources for no
reason. I still need to ponder that one a little.
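To make the trade-off concrete, here is a minimal sketch of the strict access
rule. It is not code from the patch: the class QueueLabelAccess, the NO_LABEL
constant, and canUseNode() are hypothetical names, and the empty string
standing in for "no label" is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model for illustration only; not FairScheduler code.
final class QueueLabelAccess {
    // Assumption: the empty string stands for "no label".
    static final String NO_LABEL = "";

    private final Set<String> labels;

    QueueLabelAccess(String... labels) {
        this.labels = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList(labels)));
    }

    // Strict partitioning: a queue may use a node only if the node's label
    // is one of the labels the queue was explicitly given.
    boolean canUseNode(String nodeLabel) {
        return labels.contains(nodeLabel);
    }

    public static void main(String[] args) {
        QueueLabelAccess plainQueue = new QueueLabelAccess(NO_LABEL);
        QueueLabelAccess gpuOnly = new QueueLabelAccess("GPU");
        QueueLabelAccess gpuAndNoLabel = new QueueLabelAccess("GPU", NO_LABEL);

        // A queue without the GPU label cannot reach GPU nodes.
        System.out.println(plainQueue.canUseNode("GPU"));
        // A GPU-only queue blocks the priority-boost trick, but it also
        // leaves the AM nowhere to run except on GPU nodes.
        System.out.println(gpuOnly.canUseNode(NO_LABEL));
        // Adding NO_LABEL fixes the AM case and reopens the loophole.
        System.out.println(gpuAndNoLabel.canUseNode(NO_LABEL));
    }
}
```

Neither the gpuOnly nor the gpuAndNoLabel configuration gives both properties
at once, which is the part I still need to think through.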
Multiple labels per node are explicitly not supported because of the chaos that
would create. Instead, see YARN-3409.
I do not intend to tackle relaxed partitions in this patch. That's a much
trickier implementation that requires delayed scheduling.
I will be testing failover for node labels, but I don't see any reason why it
shouldn't work as is.
bq. A queue should only access one label
That doesn't work. Because an app can only be in one queue at a time, in order
for an app to use different labels for different containers, the queue must
support multiple labels. A primary use case, as I mentioned above, is an AM
that doesn't want to consume a limited resource that its tasks will need. I
don't like it either, but I don't see another way around it.
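A rough sketch of why a one-label queue falls short. This is illustration
only, not patch code: the class SingleQueueApp, the helper rejectedLabels(),
and the use of the empty string for "no label" are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model for illustration only; not FairScheduler code.
final class SingleQueueApp {
    // Assumption: the empty string stands for "no label".
    static final String NO_LABEL = "";

    // Returns the requested labels that the app's single queue cannot serve.
    static Set<String> rejectedLabels(Set<String> queueLabels,
                                      List<String> requestedLabels) {
        Set<String> rejected = new TreeSet<>(requestedLabels);
        rejected.removeAll(queueLabels);
        return rejected;
    }

    public static void main(String[] args) {
        // One app, one queue: the AM asks for no label, the tasks ask for GPU.
        List<String> requests = Arrays.asList(NO_LABEL, "GPU", "GPU");

        Set<String> oneLabelQueue = new HashSet<>(Arrays.asList("GPU"));
        Set<String> multiLabelQueue =
            new HashSet<>(Arrays.asList("GPU", NO_LABEL));

        // With a single label, the AM's no-label request cannot be served.
        System.out.println(rejectedLabels(oneLabelQueue, requests).size());
        // With both labels, every request from the app can be served.
        System.out.println(rejectedLabels(multiLabelQueue, requests).isEmpty());
    }
}
```

Because the app cannot move between queues, the only way to serve both kinds
of request in this model is to let the one queue carry both labels.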
> Changes for fair scheduler to support allocate resource respect labels
> ----------------------------------------------------------------------
>
> Key: YARN-2497
> URL: https://issues.apache.org/jira/browse/YARN-2497
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: fairscheduler
> Reporter: Wangda Tan
> Assignee: Daniel Templeton
> Attachments: YARN-2499.WIP01.patch
>
>