[jira] [Updated] (YARN-4756) Unnecessary wait in Node Status Updater during reboot

Eric Badger (JIRA) Thu, 03 Mar 2016 11:15:41 -0800

     [ https://issues.apache.org/jira/browse/YARN-4756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Eric Badger updated YARN-4756:
------------------------------
    Attachment: YARN-4756.001.patch

The optimization that notifies the Node Status Updater thread to stop waiting 
for a heartbeat exposes a race condition in the test 
TestNodeManagerResync#testContainerResourceIncreaseIsSynchronizedWithRMResync. 
The test checks the NM's current resources, then checks them again because a 
different thread changes them. However, nothing synchronizes these two 
threads, and the test passed only because of the excessive wait time during 
the reboot. The patch adds a barrier to synchronize the two threads. 
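
For readers unfamiliar with the technique, here is a minimal sketch of how a 
barrier can order a check in one thread after an update in another. It is not 
the code in YARN-4756.001.patch: the class name BarrierSketch, the memoryMb 
field, and the values are hypothetical stand-ins for the NM's resources, and 
java.util.concurrent.CyclicBarrier is assumed as the barrier type.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not taken from the actual patch.
final class BarrierSketch {

    // Returns the value the checking thread observes after the barrier.
    static int observeAfterBarrier() throws Exception {
        // Stand-in for the NM's current resources (memory in MB).
        final AtomicInteger memoryMb = new AtomicInteger(1024);
        // Two parties: the updating thread and the checking thread.
        final CyclicBarrier barrier = new CyclicBarrier(2);

        Thread updater = new Thread(() -> {
            try {
                memoryMb.set(2048);                   // the resource change
                barrier.await(10, TimeUnit.SECONDS);  // announce it is done
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        updater.start();

        // The checker blocks here until the updater has reached the barrier,
        // so the read below cannot run before the update.
        barrier.await(10, TimeUnit.SECONDS);
        int observed = memoryMb.get();
        updater.join();
        return observed;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(observeAfterBarrier());
    }
}
```

Without the barrier, the second check could run before or after the update 
depending on thread timing, which is the kind of race described above.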

> Unnecessary wait in Node Status Updater during reboot
> -----------------------------------------------------
>
>                 Key: YARN-4756
>                 URL: https://issues.apache.org/jira/browse/YARN-4756
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>         Attachments: YARN-4756.001.patch
>
>
> The Node Status Updater thread runs until the isStopped variable is set to 
> true, but it only checks the variable after its wait for the next heartbeat 
> ends. During a reboot, the next heartbeat will not come, so the thread waits 
> until the timeout expires. Instead, we should notify the thread to continue 
> so that it can check the isStopped variable and exit without having to wait 
> for the timeout. 
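
The proposed behavior can be sketched roughly as follows. This is an 
illustration under assumptions, not the NodeManager implementation: only the 
isStopped name comes from the description above, while the class 
HeartbeatLoopSketch, the heartbeatMonitor object, and the interval value are 
hypothetical.

```java
// Hypothetical illustration only; not the actual Node Status Updater code.
final class HeartbeatLoopSketch {
    private final Object heartbeatMonitor = new Object(); // assumed monitor
    private final long heartbeatIntervalMs;
    private boolean isStopped = false; // guarded by heartbeatMonitor

    HeartbeatLoopSketch(long heartbeatIntervalMs) {
        this.heartbeatIntervalMs = heartbeatIntervalMs;
    }

    // Loop body of the updater thread: wait for the next heartbeat interval,
    // but re-check isStopped whenever the wait ends for any reason.
    void runLoop() throws InterruptedException {
        synchronized (heartbeatMonitor) {
            while (!isStopped) {
                // A real updater would send a heartbeat before waiting.
                heartbeatMonitor.wait(heartbeatIntervalMs);
            }
        }
    }

    // Called on reboot/shutdown: set the flag and wake the waiting thread so
    // it does not sit out the remainder of the interval.
    void stop() {
        synchronized (heartbeatMonitor) {
            isStopped = true;
            heartbeatMonitor.notifyAll();
        }
    }

    // Measures how long the loop takes to exit after stop() is called.
    static long millisToStop(long intervalMs) throws InterruptedException {
        HeartbeatLoopSketch sketch = new HeartbeatLoopSketch(intervalMs);
        Thread updater = new Thread(() -> {
            try {
                sketch.runLoop();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        updater.start();
        Thread.sleep(100); // give the updater time to enter wait()
        long start = System.nanoTime();
        sketch.stop();
        updater.join();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("stopped after ms: " + millisToStop(60_000));
    }
}
```

With the notifyAll() call removed from stop(), the same loop would exit only 
after the full interval elapsed, which is the unnecessary wait this issue 
describes.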



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
