[ 
https://issues.apache.org/jira/browse/YARN-11021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Tsalolikhin updated YARN-11021:
---------------------------------------
    Summary: Define Hadoop YARN term "vcore"  (was: Define Hadoop YARN "vcore")

> Define Hadoop YARN term "vcore"
> -------------------------------
>
>                 Key: YARN-11021
>                 URL: https://issues.apache.org/jira/browse/YARN-11021
>             Project: Hadoop YARN
>          Issue Type: Wish
>          Components: docs, documentation
>    Affects Versions: 3.3.1
>            Reporter: Aleksey Tsalolikhin
>            Priority: Major
>
> Hello,
> This is a request to define the Hadoop YARN term "vcore". It is clearly 
> different from a vCPU, that is, one of the virtual CPUs (or CPU cores) that 
> a system reports in /proc/cpuinfo. What is a YARN vcore, please?
> {*}Background{*}: I am running Hadoop YARN on 24 AWS EC2 instances from the 
> R5 family (memory-intensive) with the instance size of 24 XLarge (96 vCPUs 
> and 768 GB RAM each), plus the cluster master.
> I've launched a Spark application with the following spark-submit parameters:
> {{    --executor-memory 224G}}
> {{    --conf spark.executor.memoryOverhead=23901M}}
> {{    --executor-cores 32}}
> That gives each executor about 250 GB of RAM (heap plus overhead) and 32 
> vCPUs. I have Spark dynamic resource allocation enabled, so I expect to see 
> three executors per 96-vCPU instance, and that's how it turns out.
> 24 nodes x 3 executors per node = 72 executors
> Adding the Application Master container running on the Master node makes 73 
> containers.
> This matches the "73 allocated" I see in "yarn top" output in the 
> "Containers" line:
> {{    YARN top - 11:03:57, up 0d, 18:9, 1 active users, queue(s): root}}
> {{    NodeManager(s): 24 total, 24 active, 0 unhealthy, 44 decommissioned, 0 lost, 0 rebooted}}
> {{    Queue(s) Applications: 1 running, 1 submitted, 0 pending, 0 completed, 0 killed, 0 failed}}
> {{    Queue(s) Mem(GB): 183 available, 17809 allocated, 69008 pending, 247 reserved}}
> {{    Queue(s) VCores: 2230 available, 73 allocated, 279 pending, 1 reserved}}
> {{    Queue(s) Containers: 73 allocated, 279 pending, 1 reserved}}
> Most of the memory is allocated, which is as expected.
> But why does the "Queue(s) VCores" line say "73 allocated"?
> Looks like 1 VCore = 32 vCPUs?
> I looked in /etc/hadoop/conf/yarn-site.xml on one of the 24XL task
> instances with 96 vCPUs to double check how many virtual CPUs YARN thinks
> the node has, and it is 96 as expected:
> {{  <property>}}
> {{    <name>yarn.nodemanager.resource.cpu-vcores</name>}}
> {{    <value>96</value>}}
> {{  </property>}}
> I searched all the Hadoop YARN documentation linked from 
> https://hadoop.apache.org/docs/stable/index.html for a definition of a 
> Hadoop YARN vCore and could not find one.
> https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
>  uses "virtual cores" and "computation based resource" when talking about 
> vCores.
> What is a Hadoop YARN vCore?  How does it relate to the virtual CPUs I see 
> in, e.g., /proc/cpuinfo on Linux?
> There are many mentions of "vcore" in Hadoop YARN documentation; could we 
> please add a definition of this term?
> Thanks,
> Aleksey
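
The arithmetic in the description can be checked with a short sketch. It is illustrative only: every figure is copied from the issue text, the function names are hypothetical, and nothing here queries YARN or Spark. It assumes the executor container size is simply heap plus overhead and ignores any rounding YARN may apply to requests.

```python
# Figures copied from the issue description; nothing here queries a cluster.
NODES = 24
VCORES_PER_NODE = 96          # yarn.nodemanager.resource.cpu-vcores
NODE_RAM_GB = 768
EXECUTOR_MEMORY_GB = 224      # --executor-memory 224G
MEMORY_OVERHEAD_MB = 23901    # spark.executor.memoryOverhead=23901M
EXECUTOR_CORES = 32           # --executor-cores 32

# "Queue(s) VCores" line from the yarn top output.
VCORES_AVAILABLE, VCORES_ALLOCATED, VCORES_RESERVED = 2230, 73, 1


def executor_container_gb() -> float:
    """Memory of one executor container: heap plus overhead, in GB."""
    return EXECUTOR_MEMORY_GB + MEMORY_OVERHEAD_MB / 1024


def executors_per_node() -> int:
    """Executors that fit on one node, limited by memory and by cores."""
    by_memory = int(NODE_RAM_GB // executor_container_gb())
    by_cores = VCORES_PER_NODE // EXECUTOR_CORES
    return min(by_memory, by_cores)


def total_containers() -> int:
    """All executors plus one Application Master container."""
    return NODES * executors_per_node() + 1


def requested_executor_cores() -> int:
    """Cores requested by the executors alone (Application Master excluded)."""
    return NODES * executors_per_node() * EXECUTOR_CORES


def cluster_vcores() -> int:
    """Total vcores configured across all NodeManagers."""
    return NODES * VCORES_PER_NODE


def reported_vcores() -> int:
    """Sum of available, allocated and reserved vcores shown by yarn top."""
    return VCORES_AVAILABLE + VCORES_ALLOCATED + VCORES_RESERVED


if __name__ == "__main__":
    print("executors per node:      ", executors_per_node())
    print("total containers:        ", total_containers())
    print("requested executor cores:", requested_executor_cores())
    print("configured cluster vcores:", cluster_vcores())
    print("vcores reported by yarn top:", reported_vcores())
```

With these figures, the executors alone request as many cores as the cluster has configured vcores (24 x 96), and the three VCores figures from yarn top sum to the same total. The "73 allocated" value equals the container count rather than the requested core count, so the output is consistent with one vcore being counted per container, although the description by itself does not establish why.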



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

