[
https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490165#comment-16490165
]
Weiwei Yang commented on YARN-8320:
-----------------------------------
Hi [[email protected]]
Thanks for the comments; please see my points below.
bq. 1) and 3) it might make sense to use a separate resource type for this
feature
Extending resource types might not be straightforward for cpuset. Following
your suggestion, how would you define how many cpuset resources an NM has, and
how would an AM request them? It's not a numeric value. The problem is that
cpuset works on physical cores (processors) but YARN manages vcores, and a
processor can be shared by multiple containers. Hence we can hardly define
"values" if we treat it as a resource.
bq. 2) users might not need the RESERVED/SHARED modes
This was my first impression too. But after talking with [~yangjiandan] and
some other folks who manage LS services, I changed my mind. RESERVE/SHARE helps
to improve utilization and is key for mixed-workload environments, where batch
tasks run alongside services. It helps to resolve problems like the ones you
mentioned in #6 and #7.
bq. 4) The design lets the AM do a delayed exclusive request directly to the NM
avoiding the RM. I think it would be more robust to request from the RM in the
container launch context and just forward this to the NM. The RM has the chance
to decline or delay the request in this case in the future.
I agree. We have not yet figured out a way to let the RM play its role here; we
will think harder about this.
bq. 6) Let me mention that this feature negatively affects YARN-1011 and
oversubscription.
That's why we have the RESERVED/SHARED modes: they allow an LS service to share
its CPU with other tasks, including O containers (O containers will use ANY
mode). But if we set a container to EXCLUSIVE mode, then yes, it will occupy
the CPU; this is the only way to ensure that such highly sensitive tasks run
completely isolated. Most of our existing online services use RESERVED or SHARE
mode in order to improve utilization (a typical mixed-workload scenario).
bq. 5) how can you make sure a parent cgroup does not interfere with a cgroup
marked as cpuset.cpu_exclusive=1? What if a system service wakes up?
We are not going to set cpuset.cpu_exclusive=1, at least not in this version of
the design. We are trying to solve the problem of containers competing with
each other for CPU resources, not competition with system services.
bq. 7) Also, latency sensitive applications get exclusive protection but can
only be assigned to their cpuset disallowing bursts to other CPUs when needed.
I do not know how to solve this though.
Use SHARE mode. We have a lot of online services running in this mode, which
allows a service to use all processors except those assigned to EXCLUSIVE and
RESERVED containers.
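As a rough illustration of the SHARE rule (a sketch only, not code from the
patch; the function name and the 8-processor node are hypothetical), the set of
processors available to a SHARE container is simply everything not held by
EXCLUSIVE or RESERVED containers:

```python
def share_pool(all_cpus, exclusive_cpus, reserved_cpus):
    """Return the processor ids a SHARE container may run on.

    Hypothetical helper: everything on the node minus the processors
    currently assigned to EXCLUSIVE and RESERVED containers.
    """
    return sorted(set(all_cpus) - set(exclusive_cpus) - set(reserved_cpus))


# Hypothetical node with 8 processors: 0-1 are held by an EXCLUSIVE
# container and 2-3 by a RESERVED container.
print(share_pool(range(8), {0, 1}, {2, 3}))  # [4, 5, 6, 7]
```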
bq. 8) mean that other container cgroups need to be changed and reduced every
time a reserved container starts
Correct. When we assign a processor to a container using RESERVED or EXCLUSIVE,
we need to remove it from the cgroups of the other containers; this is briefly
introduced in section 3.5 of the design doc.
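To illustrate the bookkeeping, here is a minimal sketch. It assumes each
container's cpuset is tracked as a set of processor ids and rendered in the
list format that cpuset.cpus accepts (e.g. "0-2,5"). The function names and
container ids are hypothetical, nothing here touches real cgroup files, and the
actual logic in section 3.5 may differ (for example in how ANY containers are
treated):

```python
def format_cpuset(cpus):
    """Render processor ids in cpuset.cpus list format, e.g. '0-2,5'."""
    cpus = sorted(set(cpus))
    parts = []
    i = 0
    while i < len(cpus):
        j = i
        # Extend the current run while ids stay consecutive.
        while j + 1 < len(cpus) and cpus[j + 1] == cpus[j] + 1:
            j += 1
        parts.append(str(cpus[i]) if i == j else f"{cpus[i]}-{cpus[j]}")
        i = j + 1
    return ",".join(parts)


def assign_processors(cpusets, container_id, cpus):
    """Pin container_id to cpus and drop those cpus from every other container.

    Simplifying assumption: all other containers lose the processors.
    Returns a new mapping and leaves the input untouched.
    """
    cpus = set(cpus)
    updated = {cid: set(owned) - cpus
               for cid, owned in cpusets.items() if cid != container_id}
    updated[container_id] = cpus
    return updated


# Two running containers share processors 0-3; a new container "c3" is
# then given processors 0-1 in RESERVED or EXCLUSIVE mode.
cpusets = {"c1": {0, 1, 2, 3}, "c2": {0, 1, 2, 3}}
updated = assign_processors(cpusets, "c3", {0, 1})
for cid in sorted(updated):
    print(cid, format_cpuset(updated[cid]))
```

In this example c1 and c2 shrink to "2-3" while c3 gets "0-1", which is the
cgroup update that every RESERVED or EXCLUSIVE launch would trigger.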
Hope this makes sense; looking forward to hearing your feedback.
Thanks
> [Umbrella] Support CPU isolation for latency-sensitive (LS) service
> -------------------------------------------------------------------
>
> Key: YARN-8320
> URL: https://issues.apache.org/jira/browse/YARN-8320
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager
> Reporter: Jiandan Yang
> Priority: Major
> Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf,
> CPU-isolation-for-latency-sensitive-services-v2.pdf, YARN-8320.001.patch
>
>
> Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and
> “cpu.shares” to isolate CPU resources. However,
> * The Linux Completely Fair Scheduler (CFS) is a throughput-oriented
> scheduler with no support for differentiated latency.
> * The request latency of services running in containers may fluctuate
> frequently when all containers share CPUs, which latency-sensitive services
> in our production environment cannot afford.
> So we need more fine-grained CPU isolation.
> Here we propose a solution that uses cgroup cpuset to bind containers to
> different processors; this is inspired by the isolation technique in [Borg
> system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf].