[ 
https://issues.apache.org/jira/browse/YARN-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brook Zhou updated YARN-7098:
-----------------------------
    Attachment: YARN-7098.patch

> LocalizerRunner should immediately send heartbeat response 
> LocalizerStatus.DIE when the Container transitions from LOCALIZING to KILLING
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-7098
>                 URL: https://issues.apache.org/jira/browse/YARN-7098
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Brook Zhou
>            Assignee: Brook Zhou
>            Priority: Minor
>         Attachments: YARN-7098.patch
>
>
> Currently, the following race can happen:
> 1. ContainerLocalizer heartbeats to ResourceLocalizationService.
> 2. LocalizerTracker.processHeartbeat verifies that there is a LocalizerRunner 
> for the localizerId (containerId), then goes into
> {code:java}return localizer.processHeartbeat(status.getResources());{code}
> 3. The Container receives a kill event and goes from LOCALIZING -> KILLING. 
> The LocalizerRunner is removed from LocalizerTracker, since the 
> privLocalizers lock is now free.
> 4. Since check (2) passed, LocalizerRunner sends a heartbeat response with 
> LocalizerStatus.LIVE and the next file to download.
> What should happen here is that (4) sends LocalizerStatus.DIE, since (3) 
> happened before the heartbeat response in (4). This saves the container from 
> potentially downloading an extra resource because of the one extra LIVE 
> heartbeat; that resource would end up being deleted anyway.
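
The interleaving in steps 1 to 4 can be sketched with a small, self-contained model. This is not the attached patch and not the NodeManager code: the `Runner` and `Tracker` classes, the `killed` flag, and the `lookup`, `register`, and `cleanup` methods are hypothetical stand-ins, and returning DIE for an unknown localizer id is an assumption of the sketch. It only illustrates one way the runner itself could re-check the kill state after the tracker lookup has already succeeded.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

public class LocalizerHeartbeatSketch {
    // Simplified stand-in for the LIVE/DIE heartbeat response named in the issue.
    enum Status { LIVE, DIE }

    // Hypothetical stand-in for LocalizerRunner.
    static class Runner {
        // Set when the container leaves LOCALIZING for KILLING (assumed flag).
        private final AtomicBoolean killed = new AtomicBoolean(false);

        void markKilled() {
            killed.set(true);
        }

        // Re-check the kill flag here, so a heartbeat that already passed
        // the tracker lookup still gets DIE once the kill has happened.
        Status processHeartbeat() {
            return killed.get() ? Status.DIE : Status.LIVE;
        }
    }

    // Hypothetical stand-in for LocalizerTracker.
    static class Tracker {
        private final Map<String, Runner> privLocalizers = new ConcurrentHashMap<>();

        void register(String localizerId, Runner runner) {
            privLocalizers.put(localizerId, runner);
        }

        // Step 2: find the runner for this localizerId.
        Runner lookup(String localizerId) {
            return privLocalizers.get(localizerId);
        }

        // Step 3: kill path. Flag the runner first, then drop it from the map.
        void cleanup(String localizerId) {
            Runner runner = privLocalizers.get(localizerId);
            if (runner != null) {
                runner.markKilled();
                privLocalizers.remove(localizerId);
            }
        }

        // Full heartbeat path: unknown localizers are told to die (assumption).
        Status processHeartbeat(String localizerId) {
            Runner runner = lookup(localizerId);
            return runner == null ? Status.DIE : runner.processHeartbeat();
        }
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        tracker.register("container_1", new Runner());

        Runner held = tracker.lookup("container_1"); // step 2: check passes
        tracker.cleanup("container_1");              // step 3: kill arrives
        System.out.println(held.processHeartbeat()); // step 4: prints DIE
    }
}
```

In this model, step 4 answers DIE even though the lookup in step 2 succeeded, because the decision is made from state owned by the runner rather than from the earlier map lookup alone.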



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

