[ https://issues.apache.org/jira/browse/YARN-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202069#comment-14202069 ]
Jason Lowe commented on YARN-2632:
----------------------------------

Thanks for taking this up, Junping, and for the reviews, Varun! Further comments on the latest patch:

In Preconditions the wording of the first sentence is a bit off and we're inconsistent with the use of "nodemanager" vs. "NodeManager." I suggest something like the following:

{noformat}
Ephemeral ports (port 0, which is the default) cannot be used for the NodeManager address because the NodeManager may restart with a different address.
{noformat}

"that is waitting" s/b "that are waiting"

Wondering if we should simply put what's currently in the Preconditions section down in the steps for enabling NM restart. Arguably it's just the third step in the config, and we can put a line or two of explanation next to the instructions. That way if someone just scans down to the steps to enable it, they will also see that they have to not only set yarn.nodemanager.recovery.enabled and yarn.nodemanager.recovery.dir but also change yarn.nodemanager.address.

The mapreduce_shuffle auxiliary service and any other auxiliary services also need to be configured to support NM restart (e.g., avoid using ephemeral ports or otherwise support recovering with the same address). mapreduce_shuffle uses mapreduce.shuffle.port for the shuffle port, for example, and it, too, doesn't support ephemeral ports for restart. There should also be a caveat that configured auxiliary services must support recovery, or otherwise NM functionality may be affected in those areas upon restart.

"To enable NM Restart functionality, set ..." should just be "Set ..." because the line just before this already says "Enabling NM Restart consists of ...".
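
To make the configuration points above concrete, here is a minimal sketch of a check for the settings mentioned in this comment. It assumes the effective configuration has already been loaded into a plain dict of strings. The function names `port_of` and `check_nm_restart_config` are hypothetical helpers for illustration and are not part of Hadoop. Only the property names come from the comment itself.

```python
def port_of(address):
    """Return the port from a 'host:port' string, or None if it cannot be parsed."""
    _, sep, port = address.rpartition(":")
    if not sep or not port.isdigit():
        return None
    return int(port)


def check_nm_restart_config(conf):
    """Return a list of problems that would prevent a clean NM restart.

    `conf` is assumed to be a dict mapping property names to string values.
    """
    problems = []

    # Step 1: recovery must be switched on.
    if conf.get("yarn.nodemanager.recovery.enabled", "false").strip().lower() != "true":
        problems.append("yarn.nodemanager.recovery.enabled is not true")

    # Step 2: a recovery directory must be configured.
    if not conf.get("yarn.nodemanager.recovery.dir"):
        problems.append("yarn.nodemanager.recovery.dir is not set")

    # Step 3: the NM address must not use an ephemeral port. A missing value
    # is treated as ephemeral because port 0 is the default.
    port = port_of(conf.get("yarn.nodemanager.address", ""))
    if port is None or port == 0:
        problems.append("yarn.nodemanager.address must use a fixed, non-zero port")

    # The shuffle port is only checked when it is explicitly configured.
    shuffle_port = conf.get("mapreduce.shuffle.port")
    if shuffle_port is not None and shuffle_port.strip() == "0":
        problems.append("mapreduce.shuffle.port must not be 0")

    return problems


if __name__ == "__main__":
    # Example values are illustrative only.
    example = {
        "yarn.nodemanager.recovery.enabled": "true",
        "yarn.nodemanager.recovery.dir": "/var/lib/hadoop-yarn/nm-recovery",
        "yarn.nodemanager.address": "0.0.0.0:0",
    }
    for problem in check_nm_restart_config(example):
        print(problem)
```

The sketch covers only the properties named above. Other auxiliary services would need their own checks, since whether they support recovery cannot be inferred from port settings alone.
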
> Document NM Restart feature
> ---------------------------
>
> Key: YARN-2632
> URL: https://issues.apache.org/jira/browse/YARN-2632
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Reporter: Junping Du
> Assignee: Junping Du
> Priority: Blocker
> Attachments: YARN-2632-v2.patch, YARN-2632-v3.patch, YARN-2632.patch
>
>
> As a new feature to YARN, we should document this feature's behavior,
> configuration, and things to pay attention.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)