[ 
https://issues.apache.org/jira/browse/YARN-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengbing Liu updated YARN-3024:
--------------------------------
    Attachment: YARN-3024.03.patch

Updated patch.

I refactored {{LocalizerRunner#update()}}, separating it into the following two phases:
* first, read and process the resource statuses reported by the localizer through the heartbeat
* then, find the next resource to be localized and send it in the response

Also, the original code base had a small problem with the response action. 
With this patch, the response action is DIE if either of the following 
conditions is met:
* At least one FETCH_FAILURE was reported
* {{findNextResource()}} returns null, and {{pending}} is empty
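
For illustration, the two phases and the DIE conditions above can be sketched roughly as follows. This is a simplified, self-contained model and not the code in the patch: {{LocalizerUpdateSketch}}, {{Resource}}, {{Response}}, and the {{alreadyLocalized}} / {{lockedElsewhere}} flags are hypothetical stand-ins, and the assumption that a resource held by another localizer stays in {{pending}} is mine.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch only; none of these types are the real NodeManager classes.
class LocalizerUpdateSketch {
    enum Action { LIVE, DIE }
    enum Status { FETCH_SUCCESS, FETCH_PENDING, FETCH_FAILURE }

    /** Hypothetical stand-in for a resource awaiting localization. */
    static final class Resource {
        final String name;
        final boolean alreadyLocalized; // assumed: finished by someone else
        final boolean lockedElsewhere;  // assumed: another localizer holds it
        Resource(String name, boolean alreadyLocalized, boolean lockedElsewhere) {
            this.name = name;
            this.alreadyLocalized = alreadyLocalized;
            this.lockedElsewhere = lockedElsewhere;
        }
    }

    /** Hypothetical heartbeat response: an action plus an optional next resource. */
    static final class Response {
        final Action action;
        final Resource next; // null when there is nothing to send
        Response(Action action, Resource next) {
            this.action = action;
            this.next = next;
        }
    }

    final List<Resource> pending = new ArrayList<>();

    /**
     * Returns the next resource to download, or null if none is available.
     * Localized resources are dropped from pending as a side effect, so the
     * result can be null even if pending was non-empty before the call.
     */
    Resource findNextResource() {
        Iterator<Resource> it = pending.iterator();
        while (it.hasNext()) {
            Resource r = it.next();
            if (r.lockedElsewhere) {
                continue; // stays in pending; may become available later
            }
            it.remove();
            if (!r.alreadyLocalized) {
                return r;
            }
        }
        return null;
    }

    Response update(List<Status> heartbeatStatuses) {
        // Phase 1: process the statuses reported through the heartbeat.
        boolean fetchFailed = false;
        for (Status s : heartbeatStatuses) {
            if (s == Status.FETCH_FAILURE) {
                fetchFailed = true;
            }
        }
        if (fetchFailed) {
            return new Response(Action.DIE, null);
        }
        // Phase 2: pick the next resource and decide the action.
        Resource next = findNextResource();
        if (next == null && pending.isEmpty()) {
            return new Response(Action.DIE, null);
        }
        return new Response(Action.LIVE, next);
    }

    public static void main(String[] args) {
        LocalizerUpdateSketch runner = new LocalizerUpdateSketch();
        runner.pending.add(new Resource("a.jar", true, false));
        // pending was non-empty, yet nothing is left to localize.
        System.out.println(runner.update(new ArrayList<>()).action);
    }
}
```

In this model, a heartbeat that arrives when the only pending resource is already localized gets DIE, while a resource that is merely locked elsewhere keeps the localizer alive with no resource attached to the response.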


> LocalizerRunner should give DIE action when all resources are localized
> -----------------------------------------------------------------------
>
>                 Key: YARN-3024
>                 URL: https://issues.apache.org/jira/browse/YARN-3024
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Chengbing Liu
>            Assignee: Chengbing Liu
>         Attachments: YARN-3024.01.patch, YARN-3024.02.patch, 
> YARN-3024.03.patch
>
>
> We have observed that {{LocalizerRunner}} always gives a LIVE action at the 
> end of the localization process.
> The problem is that {{findNextResource()}} can return null even when 
> {{pending}} was not empty prior to the call. This method removes localized 
> resources from {{pending}}, so we should check the return value and give a 
> DIE action when it returns null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
