[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

Vinod Kumar Vavilapalli (JIRA) Fri, 15 May 2015 17:13:17 -0700

    [ https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546418#comment-14546418 ]


Vinod Kumar Vavilapalli commented on YARN-1902:
-----------------------------------------------

bq. Wangda Tan mentioned offline that we could at least deduct the count 
against the overall number (ANY request) for a given priority.
On further thought, this is not desirable in some cases either.

Take the following example.

The user originally wants: 1 container on H1, 1 container on H2, and 2 containers 
on R1 (rack). The request table becomes:
|H1|1|
|H2|1|
|R1|2|
|*|4|

Now, assuming the RM returns a container on R2 (rack), auto-decrementing the 
request table will make it:
|H1|1|
|H2|1|
|R1|2|
|*|3|

But the user may actually want something like the following, depending on the 
user's scheduling preferences:
|H1|0|
|H2|1|
|R1|2|
|*|3|
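
To make the difference between the last two tables concrete, here is a minimal sketch in Python (not YARN code). It models the request table for one priority as a plain dict. The function names auto_decrement_any and decrement_with_preference, and the given_up parameter, are hypothetical and only for illustration. Which location the user gives up (H1 here) is exactly the preference the library cannot know.

```python
# Simplified model of the request table for one priority: location -> count.
# "*" is the ANY request. This is an illustration, not AMRMClient code.

def auto_decrement_any(table):
    """Library-side policy: only the ANY ("*") count is reduced."""
    updated = dict(table)
    updated["*"] = max(0, updated["*"] - 1)
    return updated


def decrement_with_preference(table, given_up):
    """User-side policy: also reduce the location the user chooses to give up.

    `given_up` is a hypothetical parameter; only the user knows which
    host or rack request the off-rack container should count against.
    """
    updated = auto_decrement_any(table)
    updated[given_up] = max(0, updated[given_up] - 1)
    return updated


original = {"H1": 1, "H2": 1, "R1": 2, "*": 4}

# The RM returns a container on R2, which matches none of H1, H2, or R1.
print(auto_decrement_any(original))               # ANY drops from 4 to 3
print(decrement_with_preference(original, "H1"))  # H1 also drops from 1 to 0
```

The second function needs an extra argument that the first does not, which is the point of the example: auto-decrementing can only be done safely for the part that is unambiguous.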

> Allocation of too many containers when a second request is done with the same 
> resource capability
> -------------------------------------------------------------------------------------------------
>
>                 Key: YARN-1902
>                 URL: https://issues.apache.org/jira/browse/YARN-1902
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.2.0, 2.3.0, 2.4.0
>            Reporter: Sietse T. Au
>            Assignee: Sietse T. Au
>              Labels: client
>         Attachments: YARN-1902.patch, YARN-1902.v2.patch, YARN-1902.v3.patch
>
>
> Regarding AMRMClientImpl
> Scenario 1:
> Given a ContainerRequest x with Resource y, when addContainerRequest is 
> called z times with x, allocate is called and at least one of the z allocated 
> containers is started, then if another addContainerRequest call is done and 
> subsequently an allocate call to the RM, (z+1) containers will be allocated, 
> where 1 container is expected.
> Scenario 2:
> No containers are started between the allocate calls. 
> Analyzing debug logs of the AMRMClientImpl, I have found that (z+1) containers 
> are indeed requested in both scenarios, but that the correct behavior is 
> observed only in the second scenario.
> Looking at the implementation I have found that this (z+1) request is caused 
> by the structure of the remoteRequestsTable. The consequence of Map<Resource, 
> ResourceRequestInfo> is that ResourceRequestInfo does not hold any 
> information about whether a request has been sent to the RM yet or not.
> There are workarounds for this, such as releasing the excess containers 
> received.
> The solution implemented is to initialize a new ResourceRequest in 
> ResourceRequestInfo when a request has been successfully sent to the RM.
> The patch includes a test in which scenario one is tested.
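
The (z+1) request described in the issue can be illustrated with a small, simplified model in Python. This is a sketch under stated assumptions, not the actual AMRMClientImpl code: the class RequestTable, its reset_after_send flag, and the helper second_ask are hypothetical names, and the table is reduced to a count per resource capability. With the flag off, the count keeps accumulating across allocate calls. With the flag on, the count starts fresh once a request has been sent, which mirrors the idea of the described fix.

```python
# Simplified model of a client-side request table keyed by resource
# capability. Illustration only; not the actual AMRMClientImpl code.

class RequestTable:
    def __init__(self, reset_after_send):
        # True models the described fix: start a fresh request once sent.
        self.reset_after_send = reset_after_send
        self.pending = {}  # capability -> container count for the next ask

    def add_container_request(self, capability):
        self.pending[capability] = self.pending.get(capability, 0) + 1

    def allocate(self):
        """Return the ask that would be sent to the RM on this call."""
        ask = dict(self.pending)
        if self.reset_after_send:
            self.pending = {}
        return ask


def second_ask(z, reset_after_send):
    """Add z requests, allocate, add one more, and return the second ask."""
    table = RequestTable(reset_after_send)
    for _ in range(z):
        table.add_container_request("y")
    table.allocate()                  # first ask carries z containers
    table.add_container_request("y")  # the one additional request
    return table.allocate()["y"]


print(second_ask(3, reset_after_send=False))  # 4, i.e. z+1
print(second_ask(3, reset_after_send=True))   # 1, the expected count
```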



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
