[
https://issues.apache.org/jira/browse/YARN-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15806024#comment-15806024
]
Devaraj K commented on YARN-5764:
---------------------------------
Thanks [~rohithsharma] for going through this.
bq. NUMA resources are scheduled by the NodeManager. Why can't the RM make the
decision of scheduling NUMA resources using resource profiles?
With NUMA, the memory blocks and processors in a single machine are divided into
NUMA nodes, and the processors in a NUMA node can access the memory local to
that node faster. If we want the RM to schedule based on this information, each
NM would have to send its NUMA node information (i.e. List{(numanode-id,
processors, memory),..}) to the RM, and the RM would have to maintain this
information, including the usage details, for scheduling. At present the RM
already schedules NM memory and vcores as a whole, and I think it would be
cumbersome to move NUMA node scheduling, which is granular-level scheduling,
into the RM.
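To make the NM-local bookkeeping concrete, here is a minimal sketch of the kind of per-NUMA-node tracking a NodeManager could keep. The class and method names (NumaNode, NumaResourceTracker, assign) are illustrative assumptions, not the actual code in the attached patches:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of NM-side NUMA bookkeeping: one entry per NUMA node
// with its remaining memory and processors. Names are illustrative only.
class NumaNode {
    final int id;
    long freeMemoryMb;
    int freeCpus;

    NumaNode(int id, long memoryMb, int cpus) {
        this.id = id;
        this.freeMemoryMb = memoryMb;
        this.freeCpus = cpus;
    }
}

class NumaResourceTracker {
    private final List<NumaNode> nodes = new ArrayList<>();

    void addNode(NumaNode n) {
        nodes.add(n);
    }

    // Pick the first NUMA node with enough free memory and cpus for the
    // container, deduct the request, and return its id (-1 if none fits).
    int assign(long memoryMb, int cpus) {
        for (NumaNode n : nodes) {
            if (n.freeMemoryMb >= memoryMb && n.freeCpus >= cpus) {
                n.freeMemoryMb -= memoryMb;
                n.freeCpus -= cpus;
                return n.id;
            }
        }
        return -1;
    }
}
```

The point of the sketch is that this state is small and changes on every container launch/release, so keeping it inside each NM avoids streaming per-node topology and usage to the RM.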
bq. Could you elaborate, why there are multiple numa-awareness.node-ids in
single machine?
In the Non-Uniform Memory Access (NUMA) model, the memory blocks and processors
in a single machine are divided into multiple NUMA nodes, and each NUMA node has
an id assigned to it. When the user/application wants to make use of the NUMA
resources, the process should be bound to those NUMA node-ids.
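On Linux, such binding is commonly done by prefixing the launch command with numactl, whose --cpunodebind and --membind flags pin a process's cpu and memory allocations to given NUMA nodes. A sketch of building such a prefixed command follows; the wrapper class itself is hypothetical, not the patch's code:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: wrap a container launch command with numactl so the
// process's cpus and memory allocations are bound to one NUMA node.
// numactl's --cpunodebind / --membind flags are real; the wrapper is a sketch.
class NumaCommandWrapper {
    static List<String> wrap(int numaNodeId, List<String> command) {
        List<String> wrapped = new ArrayList<>();
        wrapped.add("numactl");
        wrapped.add("--cpunodebind=" + numaNodeId);
        wrapped.add("--membind=" + numaNodeId);
        wrapped.addAll(command);
        return wrapped;
    }
}
```

For example, wrapping {{bash launch_container.sh}} for node-id 1 yields {{numactl --cpunodebind=1 --membind=1 bash launch_container.sh}}, so all subsequent memory allocations of the container are served from NUMA node 1.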
> NUMA awareness support for launching containers
> -----------------------------------------------
>
> Key: YARN-5764
> URL: https://issues.apache.org/jira/browse/YARN-5764
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager, yarn
> Reporter: Olasoji
> Assignee: Devaraj K
> Attachments: NUMA Awareness for YARN Containers.pdf,
> YARN-5764-v0.patch, YARN-5764-v1.patch
>
>
> The purpose of this feature is to improve Hadoop performance by minimizing
> costly remote memory accesses on non-SMP systems. YARN containers, on launch,
> will be pinned to a specific NUMA node and all subsequent memory allocations
> will be served by the same node, reducing remote memory accesses. The current
> default behavior is to spread memory across all NUMA nodes.