Sietse T. Au commented on YARN-1902:

All solutions will still be workarounds unless the protocol is revised. 

Another workaround would be to keep track of the requests by counting the 
number of requested containers and not sending new container requests to the 
RM until the previous batch has been satisfied.

Consider the following scenario, in order:
1. addContainerRequest is called n times; at each call the expectedContainers 
counter is incremented and the container request is added to a list, 
currentContainerRequests.
2. allocate is called; a boolean waitingForResponse is set to true when 
ask.size > 0, which indicates container requests have been sent.
3. addContainerRequest is called m times; since waitingForResponse is true, 
each request is added to a list of queuedContainerRequests and its asks are 
added to asksQueue instead of asks.
4. allocate is called; n - 1 containers are returned, so expectedContainers 
is decremented by n - 1.
5. allocate is called again; 1 container is returned and expectedContainers 
drops to 0, so:
   - waitingForResponse is set to false,
   - removeContainerRequest is called for each currentContainerRequest,
   - currentContainerRequests = queuedContainerRequests,
   - asks = asksQueue,
   - expectedContainers = queuedContainerRequests.size.
6. allocate is called and the requests queued in (3) are submitted.
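The bookkeeping in steps 1-6 could be sketched roughly as below. All names here (BatchingSketch, the int parameter to allocate, string-valued requests) are hypothetical simplifications; the real AMRMClientImpl.allocate talks to the RM and returns an AllocateResponse rather than taking a container count.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the batching state machine in steps 1-6.
// Container requests are reduced to plain strings for brevity.
public class BatchingSketch {
    final List<String> currentContainerRequests = new ArrayList<>();
    List<String> queuedContainerRequests = new ArrayList<>();
    int expectedContainers = 0;
    boolean waitingForResponse = false;

    // Steps 1 and 3: queue new requests while a batch is outstanding.
    void addContainerRequest(String request) {
        if (waitingForResponse) {
            queuedContainerRequests.add(request);  // asks go to asksQueue, not asks
        } else {
            currentContainerRequests.add(request);
            expectedContainers++;
        }
    }

    // Steps 2, 4, 5 and 6: one allocate round-trip; 'returned' stands in for
    // the number of containers the RM handed back in this response.
    void allocate(int returned) {
        if (!waitingForResponse && expectedContainers > 0) {
            waitingForResponse = true;             // steps 2 and 6: asks just sent
        }
        expectedContainers -= returned;            // steps 4 and 5
        if (waitingForResponse && expectedContainers == 0) {
            // Step 5: batch satisfied -> removeContainerRequest for each
            // currentContainerRequest, then promote the queued batch.
            currentContainerRequests.clear();
            currentContainerRequests.addAll(queuedContainerRequests);
            queuedContainerRequests = new ArrayList<>();
            expectedContainers = currentContainerRequests.size();
            waitingForResponse = false;            // next allocate submits (3)
        }
    }
}
```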

Here, the satisfied container requests are correctly removed from the table 
without user intervention, which seems to cover the common use cases. Excess 
containers can now occur only when a containerRequest is removed after an 
allocate; since there is no guarantee the RM would remove it in time anyway, 
that case doesn't seem very significant.

One problem here is that expectedContainers becomes invalid in the following 
sequence: blacklist all possible nodes, add container request, allocate, 
remove blacklist, add container request, allocate.
The first request can never be satisfied, so the client would wait forever 
for its response.

I'm not sure what else users can do apart from extending AMRMClientImpl to 
fit their use case.

> Allocation of too many containers when a second request is done with the same 
> resource capability
> -------------------------------------------------------------------------------------------------
>                 Key: YARN-1902
>                 URL: https://issues.apache.org/jira/browse/YARN-1902
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.2.0, 2.3.0, 2.4.0
>            Reporter: Sietse T. Au
>            Assignee: Sietse T. Au
>              Labels: client
>         Attachments: YARN-1902.patch, YARN-1902.v2.patch, YARN-1902.v3.patch
> Regarding AMRMClientImpl
> Scenario 1:
> Given a ContainerRequest x with Resource y, when addContainerRequest is 
> called z times with x, allocate is called and at least one of the z allocated 
> containers is started, then if another addContainerRequest call is done and 
> subsequently an allocate call to the RM, (z+1) containers will be allocated, 
> where 1 container is expected.
> Scenario 2:
> No containers are started between the allocate calls. 
> Analyzing debug logs of the AMRMClientImpl, I have found that indeed (z+1) 
> containers are requested in both scenarios, but that only in the second 
> scenario is the correct behavior observed.
> Looking at the implementation I have found that this (z+1) request is caused 
> by the structure of the remoteRequestsTable. The consequence of Map<Resource, 
> ResourceRequestInfo> is that ResourceRequestInfo does not hold any 
> information about whether a request has been sent to the RM yet or not.
> There are workarounds for this, such as releasing the excess containers 
> received.
> The solution implemented is to initialize a new ResourceRequest in 
> ResourceRequestInfo when a request has been successfully sent to the RM.
> The patch includes a test in which scenario one is tested.
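To make the quoted root cause concrete, a minimal sketch of a table keyed only on the Resource might look as follows. Names are hypothetical and the Resource is reduced to a String key; the point is only that the stored count cannot distinguish requests already sent to the RM from new ones, so the full count goes out again.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a request table keyed only on the Resource, as in
// Map<Resource, ResourceRequestInfo>, retains no record of what was already
// sent to the RM, so each ask carries the whole accumulated count.
public class RemoteRequestsTableSketch {
    final Map<String, Integer> remoteRequestsTable = new HashMap<>();

    void addContainerRequest(String resource) {
        remoteRequestsTable.merge(resource, 1, Integer::sum);
    }

    // The ask sent to the RM is the whole stored count, even if z of those
    // containers were already allocated in an earlier round.
    int buildAsk(String resource) {
        return remoteRequestsTable.getOrDefault(resource, 0);
    }
}
```

After z calls to addContainerRequest, one more call makes buildAsk return z + 1, matching the excess allocation observed in scenario 1.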

This message was sent by Atlassian JIRA
