[
https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750258#comment-13750258
]
Bikas Saha commented on YARN-1027:
----------------------------------
It's a good idea to draft a path in which the HA protocol becomes another
service within the RM. We should think through the various
startup/transitionToActive()/transitionToStandby() scenarios to determine the
best approach to code this.
E.g. repeated transitions from active->standby->active for the same RM without
bringing the process down. This means that all apps in the RM (i.e. all internal
stateful objects like appmanager, scheduler, rmappimpl, etc.) must be
completely cleaned up during transitionToStandby(). Currently the RM simply
shuts down, so that cleanup has not been necessary.
This may also suggest that we logically divide RM internal objects into 2
groups: 1) stuff that can be started once and kept running until the RM stops,
and 2) stuff that needs to be cleaned up every time the RM becomes standby and
re-inited when the RM becomes active. The second group would contain things
like the scheduler, while the first would contain things like the RPC services.
The first set would be transparent to HA, while the second set would need to be
aware of HA.
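A rough sketch of that two-group split, in plain Java with no Hadoop
dependencies. Every name here (HaAwareComponent, AlwaysOnRpcServer,
ToyScheduler, ToyResourceManager) is hypothetical and only illustrates the
idea; it is not the existing RM code. It also assumes the RM comes up in
standby, which is not decided anywhere in this jira.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface for group 2 objects; not part of Hadoop.
interface HaAwareComponent {
    void activate();   // build fresh state when the RM becomes active
    void deactivate(); // drop all state when the RM becomes standby
}

// Stand-in for a group 1 object (e.g. an RPC service): started once.
class AlwaysOnRpcServer {
    private int startCount = 0;
    private boolean running = false;

    void start() { startCount++; running = true; }
    void stop() { running = false; }
    int getStartCount() { return startCount; }
    boolean isRunning() { return running; }
}

// Stand-in for a group 2 object (e.g. the scheduler).
class ToyScheduler implements HaAwareComponent {
    private List<String> apps = null; // null while standby

    @Override public void activate() { apps = new ArrayList<>(); }
    @Override public void deactivate() { apps = null; }

    void addApp(String appId) {
        if (apps == null) {
            throw new IllegalStateException("scheduler is not active");
        }
        apps.add(appId);
    }

    int appCount() { return apps == null ? 0 : apps.size(); }
}

class ToyResourceManager {
    enum HaState { STANDBY, ACTIVE }

    private final AlwaysOnRpcServer rpc = new AlwaysOnRpcServer();
    private final ToyScheduler scheduler = new ToyScheduler();
    private final List<HaAwareComponent> haAware = new ArrayList<>();
    private HaState state = HaState.STANDBY; // assumption: start in standby

    ToyResourceManager() { haAware.add(scheduler); }

    // Starts group 1 only; group 2 waits for transitionToActive().
    void start() { rpc.start(); }

    void transitionToActive() {
        if (state == HaState.ACTIVE) return; // repeated calls are no-ops
        for (HaAwareComponent c : haAware) c.activate();
        state = HaState.ACTIVE;
    }

    void transitionToStandby() {
        if (state == HaState.STANDBY) return; // repeated calls are no-ops
        for (HaAwareComponent c : haAware) c.deactivate();
        state = HaState.STANDBY;
    }

    // Shutdown reuses the standby cleanup, then stops group 1.
    void stop() { transitionToStandby(); rpc.stop(); }

    HaState getState() { return state; }
    AlwaysOnRpcServer getRpc() { return rpc; }
    ToyScheduler getScheduler() { return scheduler; }
}
```

In this sketch an active->standby->active cycle leaves the RPC stand-in
untouched (it is started exactly once) while the scheduler stand-in comes back
with no apps from the previous active period.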
Perhaps before we tackle this jira to completion, we should open and commit
another jira that identifies all stateful objects within the RM and adds
support to clean them up during RM shutdown. Those cleanup methods can be
re-used during transitionToStandby(). This jira can build on top of that.
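One possible shape for that preliminary work, again as a plain-Java sketch
under assumptions: a hypothetical StatefulObjectRegistry in which each stateful
object registers a cleanup callback, so that shutdown and transitionToStandby()
share one cleanup path. The class names, the reverse-order policy and the
"collect failures and keep going" behaviour are illustrative choices, not
anything taken from the RM code or the attached patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical registry of stateful objects; none of these names exist in YARN.
class StatefulObjectRegistry {
    // LinkedHashMap keeps registration order.
    private final Map<String, Runnable> cleanups = new LinkedHashMap<>();

    void register(String name, Runnable cleanup) {
        cleanups.put(name, cleanup);
    }

    // Runs every cleanup in reverse registration order and returns the
    // names of the objects whose cleanup threw.
    List<String> cleanUpAll() {
        List<String> names = new ArrayList<>(cleanups.keySet());
        List<String> failed = new ArrayList<>();
        for (int i = names.size() - 1; i >= 0; i--) {
            String name = names.get(i);
            try {
                cleanups.get(name).run();
            } catch (RuntimeException e) {
                failed.add(name); // keep going so one failure does not block the rest
            }
        }
        return failed;
    }
}

class RegistryBackedRm {
    // Toy stand-ins for app manager and scheduler state.
    final List<String> runningApps = new ArrayList<>();
    final Map<String, Integer> queueUsage = new LinkedHashMap<>();

    private final StatefulObjectRegistry registry = new StatefulObjectRegistry();

    RegistryBackedRm() {
        registry.register("appManager", runningApps::clear);
        registry.register("scheduler", queueUsage::clear);
    }

    // Both paths reuse the same cleanup methods.
    List<String> stop() { return registry.cleanUpAll(); }
    List<String> transitionToStandby() { return registry.cleanUpAll(); }
}
```

The registry doubles as the inventory of stateful objects: anything holding
per-app state that is not registered would show up as leftover state after a
transition.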
> Implement RMHAServiceProtocol
> -----------------------------
>
> Key: YARN-1027
> URL: https://issues.apache.org/jira/browse/YARN-1027
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Bikas Saha
> Assignee: Karthik Kambatla
> Attachments: yarn-1027-1.patch
>
>
> Implement existing HAServiceProtocol from Hadoop common. This protocol is the
> single point of interaction between the RM and HA clients/services.