[
https://issues.apache.org/jira/browse/YARN-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631039#comment-16631039
]
Haibo Chen commented on YARN-8808:
----------------------------------
Ah.. Got you.
{quote}I was just saying.. we need an additional check to see if either one of
them (you are proposing to use the former in this JIRA) is {{0}}
{quote}
I'm not sure checking nodeUtilization makes sense. Let's take a more extreme
case as an example:
1) A node (hardware) has 100 GB of capacity, and we're sharing the node with
other YARN stuff, so we'd configure NMs to limit aggregate container allocation
on each node to 10 GB.
2) The scheduler would see that this 'Node' has 10 GB to allocate because that's
what the NM tells the RM. I believe in this case YARN should try to fully
utilize just 10 GB instead of the whole node (100 GB), because YARN is entitled
to use only 10 GB. If the 10 GB is indeed fully utilized, the aggregate
container utilization is 100%, but the nodeUtilization is 10% (again, node
utilization by default is detected by a plugin on the NM side that reads from
/proc and sees the remaining system-wide 90 GB as available). I don't think we
should check whether nodeUtilization is low.
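To make the arithmetic in this example concrete, here is a rough sketch, not the
actual scheduler code. The names and the 0.8 threshold are made up for
illustration, and it assumes the containers are the only thing using memory on
the node:

```python
# Illustrative sketch only; names and the threshold value are hypothetical,
# not taken from the YARN code base or from the patch.

HARDWARE_CAPACITY_GB = 100.0      # what the node utilization plugin sees via /proc
YARN_ALLOCATABLE_GB = 10.0        # what the NM reports to the RM
OVERALLOCATION_THRESHOLD = 0.8    # hypothetical allocation threshold


def utilization(used_gb, capacity_gb):
    """Fraction of the given capacity that is in use."""
    return used_gb / capacity_gb


def may_oversubscribe(current_utilization, threshold):
    """Oversubscription is only considered while utilization is below the threshold."""
    return current_utilization < threshold


# Containers fully use the 10 GB that YARN is entitled to.
container_used_gb = 10.0

aggregate_container_utilization = utilization(container_used_gb, YARN_ALLOCATABLE_GB)
node_utilization = utilization(container_used_gb, HARDWARE_CAPACITY_GB)

# Relative to YARN's share the node is full, so no oversubscription.
by_containers = may_oversubscribe(aggregate_container_utilization, OVERALLOCATION_THRESHOLD)
# Relative to the whole machine the node looks nearly idle.
by_node = may_oversubscribe(node_utilization, OVERALLOCATION_THRESHOLD)

print(aggregate_container_utilization, node_utilization, by_containers, by_node)
```

With these numbers the two signals disagree: the container-based check refuses
to oversubscribe while the node-based check would allow it, which is the
mismatch described above.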
{quote}- Node has capacity for 4 1GB containers, but is currently running 2
containers each using more than 1.9GB - in this case, overallocation should be
allowed.
{quote}
I am not following here. The node has a capacity of 4 GB, with 2 containers
each using 1.9 GB, so the aggregate container utilization and node utilization
are both high, no? Node capacity and utilization don't have anything to do with
the number of containers, do they?
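A quick sketch of how I am reading that scenario, assuming "capacity for 4 1GB
containers" means a 4 GB node (the function name and the second usage list are
made up for illustration):

```python
# Illustrative sketch only; assumes the node capacity in the quoted case is 4 GB.

NODE_CAPACITY_GB = 4.0


def aggregate_container_utilization(container_usage_gb, node_capacity_gb):
    """Sum of per-container usage divided by the capacity YARN can allocate."""
    return sum(container_usage_gb) / node_capacity_gb


# Two containers, each using 1.9 GB.
two_large = aggregate_container_utilization([1.9, 1.9], NODE_CAPACITY_GB)

# Hypothetical comparison: four containers using the same total amount.
four_small = aggregate_container_utilization([0.95, 0.95, 0.95, 0.95], NODE_CAPACITY_GB)

print(two_large, four_small)
```

Both cases come out at roughly 95%, since only the total usage and the capacity
enter the calculation, not the number of containers.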
> Use aggregate container utilization instead of node utilization to determine
> resources available for oversubscription
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-8808
> URL: https://issues.apache.org/jira/browse/YARN-8808
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: YARN-1011
> Reporter: Haibo Chen
> Assignee: Haibo Chen
> Priority: Major
> Attachments: YARN-8088-YARN-1011.01.patch,
> YARN-8808-YARN-1011.00.patch
>
>
> Resource oversubscription should be bound to the amount of the resources that
> can be allocated to containers, hence the allocation threshold should be with
> respect to aggregate container utilization.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)