Varun Saxena commented on YARN-2902:

[~jlowe], thanks for looking at the patch. 

The reason I added the delete-downloading flag to the protocol was to tell the 
localizer that the resources it reported to the NM in its last heartbeat were 
not processed by the NM, so the localizer needs to delete them. That is why an 
extra list of paths was maintained in the localizer (the paths that have been 
reported to the NM for download).
I was working on the principle that the localizer should delete as much as it 
can, so that the paths still get deleted if the NM crashes and it is not 
work-preserving, and vice versa. With two points of deletion it is almost 
certain that downloading resources are deleted.

But yes, this does make it complex.

You are correct that the NM will know about these paths as well and can delete 
them. The extra flag in the localizer protocol can therefore be removed.
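
Roughly, NM-side cleanup without the extra flag could look like the sketch 
below. This is only an illustration under assumptions: the class and method 
names are hypothetical and are not the actual NM code or the patch, the 
deletion list stands in for a real deletion service, and resources shared 
between containers are ignored.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the NM remembers which local paths are still being
// downloaded for each container, so it can delete them itself on a kill.
class DownloadingPathTracker {
    // containerId -> local paths handed to the localizer, not yet finished
    private final Map<String, List<String>> downloading = new HashMap<>();
    // Paths queued for deletion (stand-in for a real deletion service)
    private final List<String> scheduledForDeletion = new ArrayList<>();

    // Called when the NM hands a download path to the localizer.
    void startDownload(String containerId, String localPath) {
        downloading.computeIfAbsent(containerId, k -> new ArrayList<>())
                   .add(localPath);
    }

    // Called when the localizer reports a successful download.
    void downloadFinished(String containerId, String localPath) {
        List<String> paths = downloading.get(containerId);
        if (paths != null) {
            paths.remove(localPath);
            if (paths.isEmpty()) {
                downloading.remove(containerId);
            }
        }
    }

    // Called when the container is stopped/killed while localizing:
    // every path still in flight is queued for deletion.
    void containerKilled(String containerId) {
        List<String> paths = downloading.remove(containerId);
        if (paths != null) {
            scheduledForDeletion.addAll(paths);
        }
    }

    // Returns a copy of the paths queued for deletion so far.
    List<String> scheduledForDeletion() {
        return new ArrayList<>(scheduledForDeletion);
    }
}
```

Since the NM already holds this bookkeeping, the localizer does not need to 
be told which paths to delete.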

I will update the patch.

> Killing a container that is localizing can orphan resources in the 
> ------------------------------------------------------------------------------------
>                 Key: YARN-2902
>                 URL: https://issues.apache.org/jira/browse/YARN-2902
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.5.0
>            Reporter: Jason Lowe
>            Assignee: Varun Saxena
>         Attachments: YARN-2902.002.patch, YARN-2902.03.patch, 
> YARN-2902.04.patch, YARN-2902.05.patch, YARN-2902.06.patch, 
> YARN-2902.07.patch, YARN-2902.patch
> If a container is in the process of localizing when it is stopped/killed then 
> resources are left in the DOWNLOADING state.  If no other container comes 
> along and requests these resources they linger around with no reference 
> counts but aren't cleaned up during normal cache cleanup scans since it will 
> never delete resources in the DOWNLOADING state even if their reference count 
> is zero.
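
The behavior described above can be illustrated with a small sketch. The 
types and method names here are hypothetical and simplified; they are not 
the actual NM cache cleanup code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a cache cleanup scan that never deletes resources
// in the DOWNLOADING state, even when nothing references them.
class CacheCleanupSketch {
    enum State { DOWNLOADING, LOCALIZED }

    static final class Resource {
        final String path;
        final State state;
        final int refCount;

        Resource(String path, State state, int refCount) {
            this.path = path;
            this.state = state;
            this.refCount = refCount;
        }
    }

    // Selects only LOCALIZED resources with a zero reference count.
    // A DOWNLOADING resource with zero references is skipped, which is
    // how an orphaned download can linger in the cache.
    static List<String> selectForDeletion(List<Resource> cache) {
        List<String> result = new ArrayList<>();
        for (Resource r : cache) {
            if (r.state == State.LOCALIZED && r.refCount == 0) {
                result.add(r.path);
            }
        }
        return result;
    }
}
```

With one orphaned DOWNLOADING entry and one unreferenced LOCALIZED entry in 
the cache, only the LOCALIZED path is selected.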

