[ 
https://issues.apache.org/jira/browse/YARN-373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570782#comment-13570782
 ] 

Alejandro Abdelnur commented on YARN-373:
-----------------------------------------

Hitesh,

I haven't dived into the whole approach yet; I first wanted to 'socialize' the 
idea. Now let me answer with my current thoughts.

For this use case I was not thinking about resizing 'inflight' containers. 
While we could easily resize CPU, resizing memory would be quite difficult. 

The use case is about shortcutting getting resources for a container by reusing 
the same (or fewer) resources being freed up by a terminating container on the 
same node. By doing this you don't have to go all the way to the scheduler 
and compete/wait for those resources to become available. In short, recycling 
resources the AM already got.

The terminating container would still exit, not changing the notion of 
completion of a container. The container using the recycled resources would be 
a fresh new container process. (Otherwise we could not shrink memory.)
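
To make the "same (or fewer) resources" rule concrete, here is a rough sketch 
of the check involved. The names RecycleSketch, Res and recycle are 
hypothetical and are not part of any YARN API; this only illustrates the 
bookkeeping, assuming a resource is just memory plus vcores. If the new 
container fits in what the terminating container frees up, the remainder is 
what would be reported back to the RM; if it does not fit, the AM would go 
through the scheduler as usual.

```java
import java.util.Optional;

public class RecycleSketch {
    /** Hypothetical resource vector (NOT the YARN Resource class). */
    static final class Res {
        final int memoryMb;
        final int vcores;

        Res(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /**
     * Returns the leftover to hand back to the RM when 'wanted' fits within
     * 'freed'; returns empty when it does not fit, in which case the AM
     * would have to ask the scheduler as usual.
     */
    static Optional<Res> recycle(Res freed, Res wanted) {
        if (wanted.memoryMb > freed.memoryMb || wanted.vcores > freed.vcores) {
            return Optional.empty();
        }
        return Optional.of(new Res(freed.memoryMb - wanted.memoryMb,
                                   freed.vcores - wanted.vcores));
    }

    public static void main(String[] args) {
        Res freed = new Res(2048, 2);   // resources of the terminating container
        Res wanted = new Res(512, 1);   // e.g. a smaller shuffle-serving task
        Optional<Res> leftover = recycle(freed, wanted);
        System.out.println(leftover
            .map(r -> "return to RM: " + r.memoryMb + " MB, " + r.vcores + " vcores")
            .orElse("does not fit, go to the scheduler"));
    }
}
```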

Regarding localized resources, a new resource localization would be done.
                
> Allow an AM to reuse the resources allocated to container for a new container
> -----------------------------------------------------------------------------
>
>                 Key: YARN-373
>                 URL: https://issues.apache.org/jira/browse/YARN-373
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>
> When a container completes, instead of the corresponding resources being freed 
> up, it should be possible for the AM to reuse the assigned resources for a 
> new container.
> As part of the reallocation, the AM would notify the RM about partial 
> resources being freed up and the RM would make the necessary corrections in 
> the corresponding node.
> With this functionality, an AM can ensure it gets a container on the same 
> node where previous containers ran.
> This will allow getting rid of the ShuffleHandler as a service in the NMs and 
> running it as a regular container task of the corresponding AM. In this case, the 
> reallocation would reduce the CPU/MEM obtained for the original container to 
> what is needed for serving the shuffle. Note that in this example the MR 
> AM would only do this reallocation for one of the many tasks that may have 
> run on a particular node (as a single shuffle task could serve all the map 
> outputs from all map tasks run on that node). 
