[
https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488899#comment-16488899
]
Weiwei Yang commented on YARN-8320:
-----------------------------------
Hi [~leftnoteasy]
Thanks for reviewing the design doc and for the very good comments. Please see
my answers below.
bq. the #vcore must be divisible by #physical-core, otherwise it will cause
rounding issue and containers will get less/more than requested resources
This is correct; we do need to detect #processors and run this check when the
feature is enabled. Even if this is satisfied on the NM, we still need the
calculation introduced in section 3.3: there is a formula that transforms the
#vcore a container requested into #processor, and the result can still be a
decimal, so we take the floor. The container will still get what it requested
(in terms of vcore), but when it binds to processors it might, for example, get
30% of the processors instead of 33% of the node's vcores, under the
configuration #vcore = 10 * #processor. I think we should explain how to set
these values up in a tutorial so that people can avoid such non-optimal
situations.
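To make the rounding concrete, here is a minimal sketch of the flooring step. It is not the formula from section 3.3 of the design doc (which is not reproduced in this comment); the function name and the plain integer division are assumptions made purely for illustration.

```python
def processors_for_container(requested_vcores: int,
                             node_vcores: int,
                             node_processors: int) -> int:
    """Hypothetical helper: map a vcore request to whole processors (floor)."""
    # The divisibility check discussed above: #vcore must be a multiple of #processor.
    if node_vcores % node_processors != 0:
        raise ValueError("#vcore must be divisible by #processor")
    vcores_per_processor = node_vcores // node_processors
    # Integer division floors the (possibly fractional) processor count.
    return requested_vcores // vcores_per_processor


# Assumed node: 10 processors, #vcore = 10 * #processor = 100.
# A container asking for 33 vcores (33% of the node) binds to 3 processors (30%).
bound = processors_for_container(33, 100, 10)
print(bound, "of 10 processors")
```

With these assumed numbers the container loses 0.3 of a processor to flooring, which is the gap between 33% and 30% mentioned above.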
bq. I'm still trying to understand benefit of RESERVED / SHARED mode
*Same*:
Both RESERVED and SHARE modes provide an option for an LS (latency-sensitive)
service to share some of its cpu with LT (latency-tolerant) tasks, aka offline
tasks. If the cpu gets busy, the LS service can still get enough cpu time
through its *cpu.shares* (the LS service will have a much bigger cpu.shares
value than LT tasks). So your comment "RESERVED can be affected by adhoc ANY
container" is correct, but since they have different cpu.shares values, it is
still under control : ).
*Difference*:
SHARE mode gives an LS task the option to share cpu with other LS tasks. This
helps when some less latency-sensitive tasks (less sensitive than EXCLUSIVE and
RESERVED ones) can run together to pull up cpu utilization. But such LS tasks
still want a higher weight than offline tasks (again by using cpu.shares).
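As a rough illustration of why the weighting keeps LS services under control: when sibling cgroups all compete for a saturated cpu, CFS divides cpu time in proportion to their cpu.shares values. The sketch below only does that arithmetic; the cgroup names and the share values (8192 and 1024) are assumed for the example and are not taken from the design doc.

```python
def cpu_time_fractions(shares: dict[str, int]) -> dict[str, float]:
    """Fraction of contended cpu time each sibling cgroup gets under CFS weights."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# Hypothetical weights: one LS service sharing its processors with one LT task.
fractions = cpu_time_fractions({"ls_service": 8192, "lt_task": 1024})
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of cpu time under contention")
```

The proportions only apply while both groups are runnable; when the LS service is idle, the LT task can use the spare cycles, which is the utilization benefit described above.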
bq. Related to NUMA allocation on YARN YARN-5764
Thanks for pointing this out, I was not aware of it. I just looked into the
design and implementation. I agree that feature is related to this one, but I
don't think they are highly coupled. From a rough reading of the code, I think
it is possible to proceed in two phases: in the 1st phase, we support only one
or the other, so a user can enable either NUMA or cpuset, but not both; in the
2nd phase, we make them work together (by making sure we bind processors under
the node chosen by the NUMA-side result). I will update the design doc later
with some more details.
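One way the 2nd phase could look is sketched below: restrict the processors offered to a container to those that belong to a single NUMA node. This is only an assumption about how the two features might be combined, not code from YARN-5764 or from the patch; the function name, the topology map and the free set are all hypothetical.

```python
from typing import Optional


def pick_processors(numa_node_cpus: dict[int, list[int]],
                    free_cpus: set[int],
                    count: int) -> Optional[tuple[int, list[int]]]:
    """Return (node_id, cpus) with `count` free processors on one NUMA node, or None."""
    for node_id in sorted(numa_node_cpus):
        # Only processors that are both on this node and still unbound qualify.
        candidates = [cpu for cpu in numa_node_cpus[node_id] if cpu in free_cpus]
        if len(candidates) >= count:
            return node_id, candidates[:count]
    return None  # No single node can satisfy the request.


# Hypothetical 2-node machine with 4 processors per node; cpus 0 and 1 are taken.
topology = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
free = {2, 3, 4, 5, 6}
print(pick_processors(topology, free, 3))
```

In a real integration the node would come from the NUMA allocator's decision rather than from a first-fit scan as assumed here.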
bq. Related to GPU allocation/resource plugin
We are not going to treat cpuset as any sort of resource, so this is not
applicable. Most of the work for this feature will be contained on the NM side
as a cgroups cpuset handler. This is more like the NUMA code.
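For a sense of what such a handler has to produce, the sketch below formats a set of processor ids in the list format that the cgroup `cpuset.cpus` file accepts (comma-separated ids and ranges, e.g. `0-2,5`). The helper name is hypothetical, and the actual NM handler would be written in Java; this only shows the formatting step.

```python
def to_cpuset_list(cpus) -> str:
    """Format processor ids as a cpuset list string, e.g. [0, 1, 2, 5] -> '0-2,5'."""
    ids = sorted(set(cpus))
    ranges = []
    start = prev = None
    for cpu in ids:
        if start is None:
            start = prev = cpu          # open the first range
        elif cpu == prev + 1:
            prev = cpu                  # extend the current range
        else:
            ranges.append((start, prev))
            start = prev = cpu          # gap found: open a new range
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


# The resulting string is what would be written to the container's cpuset.cpus.
print(to_cpuset_list([0, 1, 2, 5]))
```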
bq. To me only privileged users and applications can request non-ANY CPU mode
Valid suggestion. We can add this to the phase-2 work with a pre-check. I will
also update the design doc.
Thanks
> Support CPU isolation for latency-sensitive (LS) service
> --------------------------------------------------------
>
> Key: YARN-8320
> URL: https://issues.apache.org/jira/browse/YARN-8320
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager
> Reporter: Jiandan Yang
> Priority: Major
> Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf,
> CPU-isolation-for-latency-sensitive-services-v2.pdf, YARN-8320.001.patch
>
>
> Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and
> “cpu.shares” to isolate cpu resource. However,
> * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler;
> no support for differentiated latency
> * Request latency of services running in containers may fluctuate frequently
> when all containers share cpus, which latency-sensitive services cannot afford
> in our production environment.
> So we need more fine-grained cpu isolation.
> Here we propose a solution that uses cgroup cpuset to bind containers to
> different processors; this is inspired by the isolation technique in the [Borg
> system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf].