[ 
https://issues.apache.org/jira/browse/YARN-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16846407#comment-16846407
 ] 

Tao Yang commented on YARN-9576:
--------------------------------

Thanks [~jutia] for raising this issue.
I think it is indeed a problem. The reservation mechanism, including 
re-reservation, is applicable and good enough for the heartbeat-driven 
(HB-driven) scheduling process. But for the global scheduler, where all nodes 
can be sorted in parallel and taken as candidates in each scheduling process, 
the reservation mechanism should move forward too, which needs more 
discussion. A simple solution, I think, is to keep counting re-reservations 
for the request according to the current logic but skip generating the 
reservation proposal, so that the scheduler has a chance to look at the 
following candidates for this request. Thoughts?
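
To make the idea concrete, here is a rough toy model in Python. It is not 
YARN code: Node, schedule_once and simulate are hypothetical names, queue 
capacities are ignored, and the event in which the 6G container on h5 
finishes is an assumption added only to show the difference. It uses the 
scenario from the description quoted below (10 nodes with 2G free each, a 3G 
request, fixed candidate order h1..h10) and compares stopping at the 
re-reservation with counting it and moving on.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_gb: int


def schedule_once(nodes, request_gb, state, skip_re_reservation):
    """One scheduling pass over the candidates in a fixed order.

    Returns the name of the node the container was placed on, or None.
    """
    for node in nodes:
        if node.free_gb >= request_gb:
            # The request fits: allocate and drop any existing reservation.
            node.free_gb -= request_gb
            state["reserved_on"] = None
            return node.name
        if state["reserved_on"] is None:
            # First reservation: the pass ends with a reservation proposal.
            state["reserved_on"] = node.name
            return None
        if node.name == state["reserved_on"]:
            # Re-reservation is always counted.
            state["re_reservations"] += 1
            if not skip_re_reservation:
                # Modelled current behaviour: the proposal ends the pass.
                return None
            # Proposed behaviour: no proposal, keep looking at later nodes.
    return None


def simulate(skip_re_reservation, passes=5):
    # Each node has 8G, of which 6G is used by a queue A container.
    nodes = [Node(f"h{i}", free_gb=2) for i in range(1, 11)]
    state = {"reserved_on": None, "re_reservations": 0}
    schedule_once(nodes, 3, state, skip_re_reservation)  # reserves on h1
    nodes[4].free_gb = 8  # assumption: the container on h5 finishes
    for _ in range(passes):
        placed = schedule_once(nodes, 3, state, skip_re_reservation)
        if placed is not None:
            return placed, state["re_reservations"]
    return None, state["re_reservations"]


if __name__ == "__main__":
    print("stop at re-reservation:", simulate(False))
    print("skip the proposal:     ", simulate(True))
```

In this model the first variant never gets past h1, however many passes run, 
while the second still counts the re-reservation but reaches h5 on the next 
pass. Whether the real allocator can safely continue like this is exactly 
the part that needs discussion.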

>  ResourceUsageMultiNodeLookupPolicy may cause Application starve forever
> ------------------------------------------------------------------------
>
>                 Key: YARN-9576
>                 URL: https://issues.apache.org/jira/browse/YARN-9576
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: tianjuan
>            Assignee: tianjuan
>            Priority: Major
>
> It seems that ResourceUsageMultiNodeLookupPolicy in YARN-7494 may cause an 
> application to starve forever.
> For example, there are 10 nodes (h1, h2, ..., h9, h10) in the cluster, each 
> with 8G of memory, and two queues, A and B, each configured with 50% capacity.
> First, 10 jobs (each requesting 6G of resource) are submitted to queue A, 
> and each of the 10 nodes has one container allocated.
> Afterwards, another job, JobB, which requests 3G of resource, is submitted 
> to queue B, and one 3G container is reserved on node h1.
> With ResourceUsageMultiNodeLookupPolicy, the node order will always be 
> h1, h2, ..., h9, h10, so the container is always re-reserved on node h1 and 
> no other reservation happens; JobB will hang forever.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
