[
https://issues.apache.org/jira/browse/YARN-275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566253#comment-13566253
]
Siddharth Seth commented on YARN-275:
-------------------------------------
Xuan, could you please edit the JIRA title to reflect the approach: instead of
asking the NM to back off, the RM will effectively try to limit the number of
scheduling loops it ends up running.
In terms of the document itself, the scheduler thread will end up blocking if
it tries to schedule heartbeats in this manner. For starters, it may just be
simpler to keep a per-node flag that tracks whether a scheduling event already
exists for the node. When the RM pulls the updated events, it can clear the
flag. The next heartbeat from that NM will then add another scheduling event
for the node.
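
A rough sketch of the flag idea, purely for illustration. The class and method
names (NodeUpdateCoalescer, onHeartbeat, pollNextNode) are made up here and do
not exist in the YARN code base, and node ids are plain strings to keep it
self-contained. It only shows that at most one scheduling event per node sits
in the queue at any time.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: these names are assumptions, not YARN classes.
class NodeUpdateCoalescer {
    // Nodes that already have a scheduling event waiting in the queue.
    private final Set<String> pending = ConcurrentHashMap.newKeySet();
    // Scheduling events, one entry per node id.
    private final BlockingQueue<String> events = new LinkedBlockingQueue<>();

    /** Called for each NM heartbeat; returns true if a new event was queued. */
    boolean onHeartbeat(String nodeId) {
        // Set.add is atomic here, so only the first heartbeat since the last
        // poll queues an event; later ones are absorbed by the existing event.
        if (pending.add(nodeId)) {
            events.add(nodeId);
            return true;
        }
        return false;
    }

    /** Called by the scheduler thread; returns the next node id, or null. */
    String pollNextNode() {
        String nodeId = events.poll();
        if (nodeId != null) {
            // Clear the flag so the next heartbeat from this node queues again.
            pending.remove(nodeId);
        }
        return nodeId;
    }

    /** Number of scheduling events currently queued. */
    int queuedEvents() {
        return events.size();
    }
}
```

The heartbeat path never blocks in this sketch, and the queue length is bounded
by the number of nodes rather than by the heartbeat rate.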
Another option would be to just cycle through all nodes, check whether a
heartbeat has been received "recently", pull the updated data and attempt to
schedule.
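
The second option could look roughly like the following. Again this is only a
sketch under assumptions: RecentHeartbeatSweep and its methods are hypothetical
names, "recently" is modelled as a fixed time window in milliseconds, and the
caller passes in the current time so the logic is easy to exercise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: these names are assumptions, not YARN classes.
class RecentHeartbeatSweep {
    // Last heartbeat time per node id, in milliseconds.
    private final Map<String, Long> lastHeartbeatMillis = new ConcurrentHashMap<>();
    // How old a heartbeat may be and still count as "recent" (an assumption).
    private final long recencyWindowMillis;

    RecentHeartbeatSweep(long recencyWindowMillis) {
        this.recencyWindowMillis = recencyWindowMillis;
    }

    /** Called for each NM heartbeat; only records the time, queues nothing. */
    void onHeartbeat(String nodeId, long nowMillis) {
        lastHeartbeatMillis.put(nodeId, nowMillis);
    }

    /** One pass over all nodes; returns those with a recent heartbeat. */
    List<String> nodesToSchedule(long nowMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastHeartbeatMillis.entrySet()) {
            if (nowMillis - e.getValue() <= recencyWindowMillis) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

The scheduler thread would call nodesToSchedule in its own loop, pull the
updated data for each returned node and attempt to schedule on it, so the
number of scheduling passes no longer depends on how often NMs heartbeat.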
> Make NodeManagers to NOT blindly heartbeat irrespective of whether previous
> heartbeat is processed or not.
> ----------------------------------------------------------------------------------------------------------
>
> Key: YARN-275
> URL: https://issues.apache.org/jira/browse/YARN-275
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Xuan Gong
> Attachments: Prototype.txt, YARN-270.1.patch
>
>
> We need NMs to back off. The event handler mechanism is very scalable but not
> infinitely so :)