[ 
https://issues.apache.org/jira/browse/YARN-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345623#comment-15345623
 ] 

Sunil G commented on YARN-4280:
-------------------------------

bq. So yes, we need the other queues to stop allocating until the 
higher-priority queue's allocation is satisfied, or we have priority inversion 
and indefinite postponement issues.

Thanks [~jlowe] for restating the problem. Yes, I think I understood the 
intention correctly. This is a case we need to handle.

bq. expand the existing CSAssignment skipped boolean to be an enumeration of 
skipped types.
Recently, for the YARN-4091 POC work, I was looking into the various enums 
returned from the allocation call flow. Yes, it is better to add flags such as 
"queue-limit-skipped" to CSAssignment as an enum instead of the "skipped" 
boolean. That helps propagate the real skip reason up to the queue level.
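
To make the idea concrete, here is a minimal sketch of what such an enum could 
look like. The names SkippedType, Assignment, and shouldOfferToNextQueue are 
assumptions for illustration only; they are not taken from the attached patches 
or from the real CSAssignment class.

```java
// Hypothetical sketch; not the actual CapacityScheduler code.
class SkipReasonSketch {

    /** Assumed replacement for the boolean "skipped" flag. */
    enum SkippedType {
        NONE,         // the child queue allocated or reserved something
        OTHER,        // skipped for a reason the parent queue can ignore
        QUEUE_LIMIT   // skipped because a queue limit blocked the request
    }

    /** Trimmed-down stand-in for CSAssignment that only carries the skip reason. */
    static final class Assignment {
        private final SkippedType skippedType;

        Assignment(SkippedType skippedType) {
            this.skippedType = skippedType;
        }

        SkippedType getSkippedType() {
            return skippedType;
        }
    }

    /**
     * Parent-queue decision: move on to the next (less underserved) child
     * queue only when the skip reason is one the parent can ignore. On NONE
     * the node has been used; on QUEUE_LIMIT the free space is held back so
     * the blocked queue can be satisfied later.
     */
    static boolean shouldOfferToNextQueue(Assignment result) {
        return result.getSkippedType() == SkippedType.OTHER;
    }
}
```

With a plain boolean, the parent cannot tell QUEUE_LIMIT apart from OTHER, so 
it always moves on to the next queue.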



> CapacityScheduler reservations may not prevent indefinite postponement on a 
> busy cluster
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-4280
>                 URL: https://issues.apache.org/jira/browse/YARN-4280
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 2.6.1, 2.8.0, 2.7.1
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>         Attachments: YARN-4280.001.patch, YARN-4280.002.patch, 
> YARN-4280.003.patch, YARN-4280.004.patch
>
>
> Consider the following scenario:
> There are 2 queues, A (25% of the total capacity) and B (75%), and both can 
> run at total cluster capacity. There are 2 applications: appX runs on Queue A, 
> always asking for 1 GB containers (non-AM), and appY runs on Queue B, asking 
> for 2 GB containers.
> The user limit is high enough for an application to reach 100% of the 
> cluster resource. 
> appX is running at total cluster capacity, filling the cluster with 1 GB 
> containers and releasing only one container at a time. appY comes in with a 
> request for a 2 GB container, but only 1 GB is free. Ideally, since appY is in 
> the underserved queue, it has higher priority and should reserve for its 2 GB 
> request. Since this request puts alloc+reserve above the total capacity of the 
> cluster, the reservation is not made. appX comes in with a 1 GB request and, 
> since 1 GB is still available, the request is allocated. 
> This can continue indefinitely, causing priority inversion.
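
The scenario quoted above can be reproduced with a toy simulation. This is only 
a sketch under simplifying assumptions: an 8 GB cluster, one container released 
by appX per round, and a hypothetical holdFreeSpaceForAppY switch standing in 
for "other queues stop allocating"; none of these names come from the scheduler 
code.

```java
// Toy model of the quoted scenario; not CapacityScheduler code.
class PostponementSketch {

    /**
     * Returns the round in which appY's 2 GB container fits, or -1 if it
     * never fits within maxRounds.
     */
    static int roundsUntilAppYRuns(boolean holdFreeSpaceForAppY, int maxRounds) {
        final int clusterGb = 8;      // assumed cluster size
        int appXUsedGb = clusterGb;   // appX fills the cluster with 1 GB containers

        for (int round = 1; round <= maxRounds; round++) {
            appXUsedGb -= 1;          // appX releases one 1 GB container
            int freeGb = clusterGb - appXUsedGb;

            if (freeGb >= 2) {
                return round;         // appY's 2 GB request can be allocated
            }
            // appY does not fit and its reservation is rejected because
            // alloc+reserve would exceed the cluster capacity.
            if (!holdFreeSpaceForAppY) {
                appXUsedGb += 1;      // appX immediately takes the free 1 GB
            }
        }
        return -1;                    // appY was postponed for every round
    }
}
```

In this model, roundsUntilAppYRuns(false, 1000) returns -1 because appX always 
takes back the free 1 GB, while roundsUntilAppYRuns(true, 1000) returns 2 
because the free space accumulates until the 2 GB request fits.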


