[ 
https://issues.apache.org/jira/browse/YARN-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628115#comment-13628115
 ] 

Omkar Vinit Joshi commented on YARN-539:
----------------------------------------

The modified flow for successful as well as failed resource downloads is:
* Failed resource download: the Public/Private localizer notifies the tracker. 
The tracker removes the resource from its cache (no memory leak now), then 
passes the event to LocalizedResource. The resource sends 
ContainerResourceFailedEvent to all the waiting containers. The containers in 
turn send ResourceReleaseEvent. Earlier we thought about removing this release 
call, but it is required: several resources requested by a container may fail 
one after another before the container's release event, triggered by the first 
failure, has been handled on all the requested resources.
* Successful resource download: the Public/Private localizer notifies the 
tracker, which in turn notifies LocalizedResource. The resource informs all the 
waiting containers of the successful download.
* Added test TestLocalResourcesTrackerImpl.testLocalResourceCache to cover the 
resource lifecycle and the memory leak.
** Two containers request the resource. After the resource download fails, the 
containers are informed and the resource is removed from the cache. Before the 
last container's ResourceReleaseEvent is handled, another container requests 
the same resource, so that ResourceReleaseEvent returns silently without an 
exception. In the end, after successful resource localization (on the second 
attempt) and a ResourceReleaseEvent from container-3, the resource remains in 
the cache in LOCALIZED state with zero containers in the waiting queue.
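
To make the flow above easier to follow, here is a minimal sketch of it. 
It is an illustration only, not the code in the patch: ResourceCacheSketch, 
Tracker, Resource, and every method name below are hypothetical stand-ins 
for LocalResourcesTrackerImpl and LocalizedResource, and the event 
dispatcher is replaced by direct method calls.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the flow; not the NodeManager classes. */
class ResourceCacheSketch {
    enum State { DOWNLOADING, LOCALIZED, FAILED }

    /** Stand-in for LocalizedResource: a state plus the waiting containers. */
    static final class Resource {
        State state = State.DOWNLOADING;
        final Set<String> refs = new LinkedHashSet<>();
    }

    /** Stand-in for the tracker: owns the cache, keyed by resource path. */
    static final class Tracker {
        final Map<String, Resource> cache = new HashMap<>();

        /** A container asks for a resource; a cache miss creates a new entry. */
        void request(String path, String container) {
            cache.computeIfAbsent(path, p -> new Resource()).refs.add(container);
        }

        /** Failure: drop the entry from the cache first, then mark it failed. */
        Resource localizationFailed(String path) {
            Resource r = cache.remove(path);
            if (r != null) {
                r.state = State.FAILED; // containers in r.refs would be notified here
            }
            return r;
        }

        /** Success: the entry stays cached and moves to LOCALIZED. */
        void localized(String path) {
            Resource r = cache.get(path);
            if (r != null) {
                r.state = State.LOCALIZED; // containers in r.refs would be notified here
            }
        }

        /** Release: returns silently if the entry is gone or was never held. */
        void release(String path, String container) {
            Resource r = cache.get(path);
            if (r == null) {
                return; // already removed after a failed download
            }
            r.refs.remove(container); // no-op for a container of the failed attempt
        }
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        tracker.request("rsrc", "container-1");
        tracker.request("rsrc", "container-2");
        tracker.localizationFailed("rsrc");     // first attempt fails, entry removed
        tracker.release("rsrc", "container-1"); // nothing cached: silent return
        tracker.request("rsrc", "container-3"); // new entry for the second attempt
        tracker.release("rsrc", "container-2"); // late release: silent no-op
        tracker.localized("rsrc");              // second attempt succeeds
        tracker.release("rsrc", "container-3");
        Resource r = tracker.cache.get("rsrc");
        System.out.println(r.state + " with " + r.refs.size() + " waiting containers");
    }
}
```

The main method replays the scenario from testLocalResourceCache: the late 
release from a container of the failed attempt touches neither the missing 
entry nor the newly created one.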
                
> LocalizedResources are leaked in memory in case resource localization fails
> ---------------------------------------------------------------------------
>
>                 Key: YARN-539
>                 URL: https://issues.apache.org/jira/browse/YARN-539
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>         Attachments: yarn-539-20130410.patch
>
>
> If resource localization fails, the resource remains in memory and is
> 1) either cleaned up the next time cache cleanup runs and there is a space 
> crunch (if sufficient space is available in the cache, it remains in 
> memory), or
> 2) reused if a LocalizationRequest comes again for the same resource.
> I think that when resource localization fails, the event should be sent to 
> LocalResourceTracker, which will then remove the resource from its cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
