MENG DING commented on YARN-1902:

Thanks [~bikassaha] and [~vinodkv] for the education and background info. 
Really helpful. I can now appreciate that there is not a straightforward 
solution to this problem.

Originally I was coming at this from a pure user-experience point of view: I 
assumed that if I ever want to use removeContainerRequest, it should only be 
because I need to cancel previous add requests. Yes, I may still get containers 
from the previous requests before the cancellation takes effect, but that is 
understandable. However, I would never have thought that I also need to call 
removeContainerRequest for matched containers just to keep the internal 
bookkeeping of AMRMClient correct. Why should a user have to worry about these 
things?
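To make the over-ask concrete, here is a toy model (hypothetical names, not the real AMRMClientImpl) of a remoteRequestsTable keyed only by resource capability. Because the table holds just a running count, with no record of which requests the RM has already satisfied, forgetting to call removeContainerRequest for matched containers leaves the count stale:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the client-side remoteRequestsTable, keyed only by
// resource capability (simplified to a String here).
public class RemoteRequestsToyModel {
    private final Map<String, Integer> remoteRequestsTable = new HashMap<>();

    public void addContainerRequest(String capability) {
        remoteRequestsTable.merge(capability, 1, Integer::sum);
    }

    // The call the user is expected to make for each matched container;
    // skipping it is what leaves the bookkeeping stale.
    public void removeContainerRequest(String capability) {
        remoteRequestsTable.merge(capability, -1, Integer::sum);
    }

    // What the next allocate() would ask the RM for.
    public int pendingAsk(String capability) {
        return remoteRequestsTable.getOrDefault(capability, 0);
    }

    public static void main(String[] args) {
        RemoteRequestsToyModel client = new RemoteRequestsToyModel();
        for (int i = 0; i < 3; i++) {
            client.addContainerRequest("2GB,1vcore"); // z = 3
        }
        // ... allocate() runs, 3 containers arrive and are started, but the
        // user never calls removeContainerRequest for the matched ones ...
        client.addContainerRequest("2GB,1vcore"); // user wants just 1 more
        System.out.println(client.pendingAsk("2GB,1vcore")); // 4, not 1
    }
}
```

The printed ask of 4 instead of 1 is exactly the (z+1) allocation described in the issue below.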

After reading the comments, I am starting to think that even if we were able to 
figure out which ResourceRequest to deduct from and automatically deduct it at 
the client, it still wouldn't solve race condition 1 (i.e., allocated 
containers sitting in the RM).

So rather than changing the client, can we not do something on the RM side? For 
example, in AppSchedulingInfo:
1. Maintain a table for total requests *only*. The updateResourceRequests() 
call will update this table to reflect the total requests from the client 
(matching the client-side remoteRequestsTable).
2. Maintain a table for requests that have been satisfied. Every time a 
successful allocation is made for this application, this table is updated.
3. Maintain a table for the difference between table 1 and table 2, i.e., the 
outstanding resource requests. This table is updated on every 
updateResourceRequests() call and every successful allocation. Of course, 
proper synchronization needs to be taken care of.
4. Scheduling will be based on table 3 (i.e., the outstanding request table).
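The steps above can be sketched roughly as follows. This is a minimal model with hypothetical names, not the actual AppSchedulingInfo API; resources are simplified to string keys, and table 3 is derived on demand rather than stored:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed RM-side bookkeeping: table 1 holds the total
// requested containers per resource key, table 2 holds satisfied
// allocations, and the outstanding count (table 3) is their difference.
public class OutstandingRequestSketch {
    // Table 1: total containers requested, per resource key.
    private final Map<String, Integer> totalRequested = new HashMap<>();
    // Table 2: containers already allocated, per resource key.
    private final Map<String, Integer> satisfied = new HashMap<>();

    // Mirrors updateResourceRequests(): the client reports its current total.
    public synchronized void updateTotalRequest(String key, int numContainers) {
        totalRequested.put(key, numContainers);
    }

    // Called on every successful allocation for this application.
    public synchronized void recordAllocation(String key) {
        satisfied.merge(key, 1, Integer::sum);
    }

    // Table 3: outstanding = total - satisfied. The scheduler would only
    // allocate while this is positive.
    public synchronized int outstanding(String key) {
        return totalRequested.getOrDefault(key, 0)
                - satisfied.getOrDefault(key, 0);
    }

    public static void main(String[] args) {
        OutstandingRequestSketch info = new OutstandingRequestSketch();
        info.updateTotalRequest("2GB,1vcore", 3); // client asked for 3 (z = 3)
        info.recordAllocation("2GB,1vcore");      // RM allocated one
        info.recordAllocation("2GB,1vcore");      // RM allocated another
        System.out.println(info.outstanding("2GB,1vcore")); // prints 1
        // The client later asks for one more; its total (matching the
        // remoteRequestsTable) becomes 4, so only 4 - 2 = 2 are outstanding
        // instead of all 4 being re-requested.
        info.updateTotalRequest("2GB,1vcore", 4);
        System.out.println(info.outstanding("2GB,1vcore")); // prints 2
    }
}
```

With this shape, the client's stale total no longer translates into extra containers, because the RM subtracts what it has already satisfied before scheduling.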

Do you think this is something worth considering?

Thanks a lot in advance.

> Allocation of too many containers when a second request is done with the same 
> resource capability
> -------------------------------------------------------------------------------------------------
>                 Key: YARN-1902
>                 URL: https://issues.apache.org/jira/browse/YARN-1902
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.2.0, 2.3.0, 2.4.0
>            Reporter: Sietse T. Au
>            Assignee: Sietse T. Au
>              Labels: client
>         Attachments: YARN-1902.patch, YARN-1902.v2.patch, YARN-1902.v3.patch
> Regarding AMRMClientImpl
> Scenario 1:
> Given a ContainerRequest x with Resource y, when addContainerRequest is 
> called z times with x, allocate is called and at least one of the z allocated 
> containers is started, then if another addContainerRequest call is done and 
> subsequently an allocate call to the RM, (z+1) containers will be allocated, 
> where 1 container is expected.
> Scenario 2:
> No containers are started between the allocate calls. 
> Analyzing debug logs of the AMRMClientImpl, I have found that (z+1) 
> containers are indeed requested in both scenarios, but that only in the 
> second scenario is the correct behavior observed.
> Looking at the implementation I have found that this (z+1) request is caused 
> by the structure of the remoteRequestsTable. The consequence of Map<Resource, 
> ResourceRequestInfo> is that ResourceRequestInfo does not hold any 
> information about whether a request has been sent to the RM yet or not.
> There are workarounds for this, such as releasing the excess containers 
> received.
> The solution implemented is to initialize a new ResourceRequest in 
> ResourceRequestInfo when a request has been successfully sent to the RM.
> The patch includes a test in which scenario one is tested.
