[
https://issues.apache.org/jira/browse/YARN-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xuan Gong updated YARN-275:
---------------------------
Attachment: YARN-270.1.patch
1. In RMNode, store the NodeUpdatedDone status in a field, and have the
scheduler mark the event as processed once it has finished processing it.
2. In RTS, before sending another status update to RMNode, check whether
RMNode's last update was processed; if not, ask the NM to back off.
3. Implemented the back-off by sending the next heartbeat interval to the remote
NMs. Originally NodeStatusUpdater pinged at a hard-coded 1-second interval; now
the next heartbeat interval comes from the RM (currently hard-coded to 5
seconds; we may need to find another way to do this).
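
To make the three steps above concrete, here is a minimal, self-contained sketch
of the idea. It is not the code in the attached patch: NodeState,
HeartbeatReply and HeartbeatHandler are hypothetical stand-ins for RMNode, the
heartbeat response and RTS, and the choice of 1 second for the normal interval
and 5 seconds for the back-off interval is an assumption made for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the per-node state kept in RMNode. */
class NodeState {
    // True once the scheduler has finished processing the last status update.
    private final AtomicBoolean lastUpdateProcessed = new AtomicBoolean(true);

    /** Atomically claims the slot for a new update; false if one is still pending. */
    boolean tryBeginUpdate() {
        return lastUpdateProcessed.compareAndSet(true, false);
    }

    /** Called by the scheduler once it is done with the update. */
    void markUpdateProcessed() {
        lastUpdateProcessed.set(true);
    }
}

/** Hypothetical heartbeat response that carries the next interval to the NM. */
final class HeartbeatReply {
    final boolean updateAccepted;
    final long nextHeartbeatIntervalMs;

    HeartbeatReply(boolean updateAccepted, long nextHeartbeatIntervalMs) {
        this.updateAccepted = updateAccepted;
        this.nextHeartbeatIntervalMs = nextHeartbeatIntervalMs;
    }
}

/** Hypothetical stand-in for the heartbeat handling done in RTS. */
class HeartbeatHandler {
    // Assumed values for illustration only.
    static final long NORMAL_INTERVAL_MS = 1000L;
    static final long BACKOFF_INTERVAL_MS = 5000L;

    HeartbeatReply onHeartbeat(NodeState node) {
        if (!node.tryBeginUpdate()) {
            // Previous update not processed yet: drop this one and ask the NM to back off.
            return new HeartbeatReply(false, BACKOFF_INTERVAL_MS);
        }
        // The update would be forwarded to the scheduler here.
        return new HeartbeatReply(true, NORMAL_INTERVAL_MS);
    }
}
```

On the NM side, the status updater would then sleep for nextHeartbeatIntervalMs
from the reply instead of a fixed interval. The compare-and-set is used so that
two heartbeats racing for the same node cannot both be accepted.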
> Make NodeManagers to NOT blindly heartbeat irrespective of whether previous
> heartbeat is processed or not.
> ----------------------------------------------------------------------------------------------------------
>
> Key: YARN-275
> URL: https://issues.apache.org/jira/browse/YARN-275
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Xuan Gong
> Attachments: YARN-270.1.patch
>
>
> We need NMs to back off. The event handler mechanism is very scalable but not
> infinitely so :)