[ 
https://issues.apache.org/jira/browse/YARN-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628492#comment-13628492
 ] 

Omkar Vinit Joshi commented on YARN-539:
----------------------------------------

bq. Resource doesn't have life, so it can't 'fail'. In that sense, shall we 
rename ResourceFailedEvent to ResourceFailedLocalizationEvent? Similarly 
ResourceEventType.FAILED to ResourceEventType.LOCALIZATION_FAILED?
Fixed.

bq. Dismantle localizationCompleted altogether? Makes code much more readable 
IMO.
Yes, this is no longer required and can be simplified :). Updating handle 
accordingly.

bq. The log message for release doesn't need to specifically talk about failed 
resources. A release on a resource that is long gone for whatever reason will 
run into this code-path.
Yes, you are right. For example, if a resource's local file is deleted (or 
becomes inaccessible for some reason), we will also end up getting these messages.

bq. Not related to your patch, but the code for REQUEST can be simplified by 
doing the null check first.
No, I think the current flow is correct:
* If the resource is not null, check whether it is still present on disk; if it 
is not, remove it from the cache.
* If the resource is now null, there are two possibilities, and in both cases 
we need to recreate the resource:
** either the resource's local copy is inaccessible,
** or the resource is being requested for the first time.
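
The flow above can be sketched roughly as follows. This is only an illustration, not the code in the patch: RequestFlowSketch, Entry, and handleRequest are hypothetical names, and the cache is reduced to a plain map keyed by a string.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the tracker's cache. This is NOT the
// real tracker implementation; all names here are assumptions.
class RequestFlowSketch {
  // Hypothetical cache entry: only remembers where the local copy lives
  // (null while the resource has not been downloaded yet).
  static final class Entry {
    final File localPath;
    Entry(File localPath) { this.localPath = localPath; }
  }

  private final Map<String, Entry> cache = new HashMap<>();

  void put(String key, File localPath) { cache.put(key, new Entry(localPath)); }

  boolean contains(String key) { return cache.containsKey(key); }

  /** Returns true if handling the request had to (re)create the entry. */
  boolean handleRequest(String key) {
    Entry rsrc = cache.get(key);
    // Step 1: a known resource whose local copy is gone is dropped from the cache.
    if (rsrc != null && rsrc.localPath != null && !rsrc.localPath.exists()) {
      cache.remove(key);
      rsrc = null;
    }
    // Step 2: null now means either "first request" or "local copy
    // inaccessible"; both cases need a fresh entry.
    if (rsrc == null) {
      cache.put(key, new Entry(null));
      return true;
    }
    return false;
  }
}
```

Doing the null check first would still need the on-disk check before it, because that check is what turns a stale entry into the null case.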


bq. Null checks needed for rsrc on LOCALIZED and FAILED events?

This can never occur. When these events arrive, the resource is still in the 
DOWNLOADING state, and it is never removed in that state because its ref 
count > 0 (ResourceRetentionSet.addResources).
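
A minimal sketch of that argument, assuming a much simplified cache: RefCountSketch and its methods are hypothetical names, not the NodeManager classes. Cleanup only evicts entries with a ref count of 0, so an entry that is still being downloaded is present when LOCALIZED or the failure event arrives, and the failure handler is what removes it.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical model of a ref-counted resource cache; names are assumptions.
class RefCountSketch {
  enum State { DOWNLOADING, LOCALIZED }

  static final class Entry {
    State state = State.DOWNLOADING;
    int refCount = 0;
  }

  private final Map<String, Entry> cache = new HashMap<>();

  // A request creates the entry if needed and takes a reference on it.
  void request(String key) {
    cache.computeIfAbsent(key, k -> new Entry()).refCount++;
  }

  void release(String key) {
    Entry e = cache.get(key);
    if (e != null && e.refCount > 0) {
      e.refCount--;
    }
  }

  // Cache cleanup: only unreferenced entries are eligible for eviction.
  int cleanup() {
    int removed = 0;
    for (Iterator<Entry> it = cache.values().iterator(); it.hasNext();) {
      if (it.next().refCount == 0) {
        it.remove();
        removed++;
      }
    }
    return removed;
  }

  // No null check here: the entry is referenced, so cleanup cannot have removed it.
  void onLocalized(String key) { cache.get(key).state = State.LOCALIZED; }

  // On failure the entry is dropped so it does not linger in memory.
  void onLocalizationFailed(String key) { cache.remove(key); }

  boolean contains(String key) { return cache.containsKey(key); }
}
```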
                
> LocalizedResources are leaked in memory in case resource localization fails
> ---------------------------------------------------------------------------
>
>                 Key: YARN-539
>                 URL: https://issues.apache.org/jira/browse/YARN-539
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>         Attachments: yarn-539-20130410.1.patch, yarn-539-20130410.patch
>
>
> If resource localization fails, the resource remains in memory and is either
> 1) cleaned up the next time cache cleanup runs and there is a space crunch 
> (if sufficient space is available in the cache, it will remain in memory), or
> 2) reused if a LocalizationRequest comes again for the same resource.
> I think that when resource localization fails, that event should be sent to 
> LocalResourceTracker, which will then remove the resource from its cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
