[ https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494656#comment-16494656 ]
Miklos Szegedi commented on YARN-8320:
--------------------------------------
Thank you [~cheersyang] for the detailed response. The only thing I think you
are missing is that cpu and cpuset are not the same resource in cgroups.
They are actually two dimensions of the CPU space. cpu,cpuacct controls, in
general, how much CPU time is allocated (one dimension), and cpuset controls how
many physical devices are allocated (the second dimension). cpu,cpuacct is a
compressible, flexible resource: more of it will almost always proportionally
reduce the runtime of a CPU-bound workload. cpuset is not flexible; it depends
on the thread factor of the container.
Just to use your example above:
{code:java}
I have a NM with capacity:
memory: 10gb
vcore: 10
cpus: 10 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Request with just vcore number (the container runs a single process and a single thread!):
memory: 1gb
vcore: 5
After allocation, my NM capacity updates to
memory: 9gb
vcore: 5
cpus: 5 (0, 1, 2, 3, 4) WRONG(!) The process is single threaded, 4 cores are wasted.
Request with both vcore number and cpus:
memory: 1gb
vcore(cputime): 5
cpuset: 1
After allocation, my NM capacity updates to
memory: 9gb
vcore(cputime): 5
cpus: 9 (0, 1, 2, 3, 4, 5, 6, 7, 8) GOOD The process is single threaded.
{code}
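To make the two dimensions concrete, here is a minimal sketch of the bookkeeping in the example. It is only an illustration under my own assumptions: the class NodeCpuAccounting and its methods are hypothetical names, not existing YARN APIs, and it only models the accounting, it does not touch cgroups.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical model: a node tracks CPU time (vcores) and CPU devices (cpuset) separately. */
class NodeCpuAccounting {
    private int freeVcores;                                     // time dimension (cpu,cpuacct)
    private final TreeSet<Integer> freeCpus = new TreeSet<>();  // device dimension (cpuset)

    NodeCpuAccounting(int vcores, int cpus) {
        this.freeVcores = vcores;
        for (int i = 0; i < cpus; i++) {
            freeCpus.add(i);
        }
    }

    /** Reserves vcores of CPU time and pins cpusetSize cores; returns the pinned core ids. */
    List<Integer> allocate(int vcores, int cpusetSize) {
        if (vcores < 0 || cpusetSize < 0
                || vcores > freeVcores || cpusetSize > freeCpus.size()) {
            throw new IllegalStateException("insufficient CPU resources");
        }
        freeVcores -= vcores;
        List<Integer> pinned = new ArrayList<>();
        for (int i = 0; i < cpusetSize; i++) {
            pinned.add(freeCpus.pollLast()); // take the highest-numbered free core
        }
        return pinned;
    }

    int freeVcores() { return freeVcores; }

    int freeCpuCount() { return freeCpus.size(); }

    public static void main(String[] args) {
        // Coupled design: the cpuset size is forced to equal the vcore count.
        NodeCpuAccounting coupled = new NodeCpuAccounting(10, 10);
        coupled.allocate(5, 5);
        System.out.println("coupled:   vcores=" + coupled.freeVcores()
                + " cpus=" + coupled.freeCpuCount());

        // Decoupled design: a single-threaded container asks for 5 vcores but 1 core.
        NodeCpuAccounting decoupled = new NodeCpuAccounting(10, 10);
        decoupled.allocate(5, 1);
        System.out.println("decoupled: vcores=" + decoupled.freeVcores()
                + " cpus=" + decoupled.freeCpuCount());
    }
}
```

With the coupled request the node is left with 5 free cores; with the decoupled request it is left with 9, which is the difference the example is meant to show.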
I understand that you would like to simplify the configuration. However, as the
allocation example above shows, if the current design is chosen, YARN will never
be able to solve this case later, because of backward compatibility.
That being said, if you would still like to follow the simplified path, please
go ahead; I just wanted to elaborate on my concerns.
> [Umbrella] Support CPU isolation for latency-sensitive (LS) service
> -------------------------------------------------------------------
>
> Key: YARN-8320
> URL: https://issues.apache.org/jira/browse/YARN-8320
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager
> Reporter: Jiandan Yang
> Priority: Major
> Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf,
> CPU-isolation-for-latency-sensitive-services-v2.pdf, YARN-8320.001.patch
>
>
> Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and
> “cpu.shares” to isolate the cpu resource. However,
> * The Linux Completely Fair Scheduler (CFS) is a throughput-oriented scheduler
> with no support for differentiated latency.
> * Request latency of services running in containers may fluctuate frequently
> when all containers share cpus, which latency-sensitive services cannot afford
> in our production environment.
> So we need more fine-grained cpu isolation.
> Here we propose a solution that uses cgroup cpuset to bind containers to
> different processors; this is inspired by the isolation technique in the [Borg
> system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf].