[
https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195488#comment-16195488
]
Yufei Gu commented on YARN-2497:
--------------------------------
Thanks for working on this, [~templedf]. I have a fairly long and detailed set
of comments, but I want to discuss with you before I post them.
Node labeling inevitably partitions the cluster, which affects the scheduler a
lot. It can lead to lower utilization, unfairness, infinite loops, or
deadlock/livelock. Lower utilization and unfairness seem benign but hard to
avoid, but we definitely want to prevent infinite loops and
deadlock/livelock. The current solution supports labeling for each queue, which
makes the schedulers much more complicated, and makes support and maintenance
much harder. Based on my experience with the queue-level MaxAMShare, the
community spent years and a bunch of JIRAs getting it right after it was
introduced into the fair scheduler. Even so, MaxAMShare issues still pop up and
bite us from time to time. Queue-level labeling is much more complex than that
in terms of calculating fair share, max share, and max AM share. We probably
don't want to do that if we can avoid it.
Instead of queue labeling, I suggest application-level labeling. It would be
pretty similar to what we did for data locality. Data locality doesn't affect
queue management or any of the calculations related to queue properties. It's
all about resource requests and node properties, which is similar to node
labeling. In that sense, node labeling wouldn't affect queue management at
all.
What do you think?
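To make the application-level idea a bit more concrete, here is a rough sketch
of a label check that looks only at a resource request and a node, the same way
a locality check does. Everything in it is hypothetical: the class name, the
method, and the rule that an unlabeled request may run on any node are
assumptions for illustration, not the existing YARN API and not YARN's
partition semantics.

```java
import java.util.Set;

/** Hypothetical sketch only; this class does not exist in YARN. */
public class AppLevelLabelSketch {

    /**
     * Decides whether a request may be placed on a node, using only the
     * label carried by the request and the labels attached to the node.
     * No queue state (fair share, max share, max AM share) is consulted.
     *
     * Assumption: a null or empty requested label means "no constraint".
     */
    public static boolean matches(String requestedLabel, Set<String> nodeLabels) {
        if (requestedLabel == null || requestedLabel.isEmpty()) {
            return true; // unlabeled request: any node is acceptable (assumed)
        }
        return nodeLabels.contains(requestedLabel);
    }

    public static void main(String[] args) {
        Set<String> gpuNode = Set.of("gpu", "ssd");
        Set<String> plainNode = Set.of();

        System.out.println(matches("gpu", gpuNode));   // request label present on node
        System.out.println(matches("gpu", plainNode)); // request label missing on node
        System.out.println(matches("", plainNode));    // unlabeled request
    }
}
```

The point of the sketch is that the check is a pure function of the request and
the node, so queue-level calculations would not need to know about labels.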
> Changes for fair scheduler to support allocate resource respect labels
> ----------------------------------------------------------------------
>
> Key: YARN-2497
> URL: https://issues.apache.org/jira/browse/YARN-2497
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: fairscheduler
> Reporter: Wangda Tan
> Assignee: Daniel Templeton
> Attachments: YARN-2497.001.patch, YARN-2497.002.patch,
> YARN-2497.003.patch, YARN-2497.004.patch, YARN-2497.005.patch,
> YARN-2497.006.patch, YARN-2497.007.patch, YARN-2497.008.patch,
> YARN-2497.009.patch, YARN-2497.010.patch, YARN-2499.WIP01.patch
>
>