[ 
https://issues.apache.org/jira/browse/YARN-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Zhiguo updated YARN-3965:
------------------------------
    Attachment: YARN-3965-2.patch

The first patch breaks TestNMWebServices.verifyNodeInfo. Corrected in this one.

> Add startup timestamp for nodemanager
> -------------------------------------
>
>                 Key: YARN-3965
>                 URL: https://issues.apache.org/jira/browse/YARN-3965
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: Hong Zhiguo
>            Assignee: Hong Zhiguo
>            Priority: Minor
>         Attachments: YARN-3965-2.patch, YARN-3965.patch
>
>
> The RM already exposes a startup timestamp, but the NM does not.
> Sometimes a cluster operator modifies the configuration of all nodes and kicks
> off a command to restart all NMs. It is then hard to check whether all NMs
> were actually restarted. In practice, some NMs always fail to restart as
> expected, which leads to errors later due to inconsistent configuration.
> If the NM had a startup timestamp, the operator could easily fetch it via the
> NM webservice, find out which NMs didn't restart, and take manual action
> for them.
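
The check described above could be scripted roughly as follows. This is only a sketch under assumptions: it uses the NM REST endpoint /ws/v1/node/info on the default NM web port 8042, and it assumes the startup time appears there as epoch milliseconds in a field called "nmStartupTime". That field name is a guess for illustration, not something stated in this issue; adjust it to whatever the patch actually exposes. The helper names are hypothetical too.

```python
import json
from urllib.request import Request, urlopen

# ASSUMPTION: name of the startup-time field in the NM "nodeInfo" object.
# The issue does not state the field name; change it to match the patch.
STARTUP_FIELD = "nmStartupTime"


def fetch_startup_ms(host, port=8042, timeout=5):
    """Fetch one NM's startup time (epoch ms) from its web service."""
    url = "http://%s:%d/ws/v1/node/info" % (host, port)
    req = Request(url, headers={"Accept": "application/json"})
    with urlopen(req, timeout=timeout) as resp:
        info = json.load(resp)["nodeInfo"]
    return int(info[STARTUP_FIELD])


def stale_nodes(startup_ms_by_host, restart_issued_ms):
    """Return hosts whose NM started before the restart command was issued."""
    return sorted(
        host
        for host, started in startup_ms_by_host.items()
        if started < restart_issued_ms
    )
```

An operator would collect fetch_startup_ms(host) for every node into a dict, then pass it to stale_nodes together with the time the restart command was issued. Any host returned still runs an NM that predates the restart.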



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
