[
https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745841#comment-13745841
]
Bikas Saha commented on YARN-1027:
----------------------------------
First of all, thanks for writing the patch and testing it. This shows that HA
awareness can be added without a significant overhaul of the RM.
I wish I could say that I like the hybrid approach, but after reading the patch
unfortunately that's not the case.
Having a pure wrapper approach that simply does a "new ResourceManager()" upon
transitionToActive() has the virtue of being completely separate from the RM
and being simple. Having HAService built into ResourceManager as a service
integrates it completely with the ResourceManager flow and allows for features
like RPC redirect in tandem with other RM services. Taking the hybrid approach
drops the simplicity of the wrapper while at the same time making it complex to
interact with the ResourceManager. Which one is the real ResourceManager?
For example, there are many tests that use the ResourceManager, but since they
don't use HAResourceManager they are probably not exercising some
possibilities. Should they use HAResourceManager?
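To make the contrast between the two pure designs concrete, here is a minimal
sketch. All class names (RmSketch, WrapperHaSketch, HaAwareRmSketch) and the
handleRpc method are hypothetical and only illustrate the shape of each design;
they are not the classes in the patch or in YARN.

```java
// Hypothetical stand-in for the RM; none of these are actual YARN classes.
class RmSketch {
    private boolean started;

    void start() { started = true; }
    void stop() { started = false; }
    boolean isStarted() { return started; }
}

// Pure wrapper: HA logic sits entirely outside the RM and simply creates
// a fresh RM each time it becomes active.
class WrapperHaSketch {
    private RmSketch rm; // null while in standby

    void transitionToActive() {
        if (rm == null) {
            rm = new RmSketch();
            rm.start();
        }
    }

    void transitionToStandby() {
        if (rm != null) {
            rm.stop();
            rm = null;
        }
    }

    boolean isActive() { return rm != null && rm.isStarted(); }
}

// Built-in: the RM itself owns the HA state, so there is a single RM class
// and other RM services can consult that state.
class HaAwareRmSketch {
    private boolean active; // starts in standby

    void transitionToActive() { active = true; }
    void transitionToStandby() { active = false; }
    boolean isActive() { return active; }

    // Illustrative only: a service inside the RM that redirects while standby.
    String handleRpc(String request) {
        return active ? "handled:" + request : "redirect-to-active";
    }
}
```

In the wrapper, tests that construct the RM directly never touch the HA code
path, whereas in the built-in form every user of the RM goes through the same
class.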
Fundamentally, HA is going to be an integral part of the ResourceManager and to
me it does not make sense to create a derived impl of the ResourceManager in
order to add the HA logic. What other derivations are possible for the RM that
motivate the use of inheritance and sub-classing? Why have 2 impls for
essentially the same component?
Starting up and stopping services is not super fast and will add time to the
failover. So unless there is an obstacle to that path, we should be looking at
starting as many (if not all) services on the RM so that the only thing that's
blocking failover is populating the state. As discussed earlier, it's not
necessary for all services to be started in the first cut. We can choose to
start the HA service only.
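As a rough illustration of the failover path described above, the sketch below
starts services once while in standby and leaves only state population for the
transition to active. The class WarmStandbyRmSketch, its methods, and the use
of plain strings for services and stored applications are all assumptions made
for illustration, not the RM's actual structure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names do not correspond to actual YARN classes.
class WarmStandbyRmSketch {
    private final List<String> startedServices = new ArrayList<>();
    private final List<String> recoveredApps = new ArrayList<>();
    private boolean active; // starts in standby

    // Called once at process start, while in standby: the service start-up
    // cost is paid here rather than during failover.
    void start(List<String> services) {
        startedServices.addAll(services);
    }

    // Failover path: services are already running, so the only work left
    // is populating the state.
    void transitionToActive(List<String> storedApps) {
        if (active) {
            return; // already active, nothing to do
        }
        recoveredApps.addAll(storedApps);
        active = true;
    }

    // Drop the in-memory state but keep the services running.
    void transitionToStandby() {
        if (!active) {
            return;
        }
        recoveredApps.clear();
        active = false;
    }

    boolean isActive() { return active; }
    int startedServiceCount() { return startedServices.size(); }
    int recoveredAppCount() { return recoveredApps.size(); }
}
```

A first cut could pass only the HA service to start() and move further
services into it over time without changing the transition methods.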
I would really encourage attempting to make HAService part of ResourceManager
itself. I can help with the patch if needed.
> Implement RMHAServiceProtocol
> -----------------------------
>
> Key: YARN-1027
> URL: https://issues.apache.org/jira/browse/YARN-1027
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Bikas Saha
> Assignee: Karthik Kambatla
> Attachments: yarn-1027-1.patch
>
>
> Implement existing HAServiceProtocol from Hadoop common. This protocol is the
> single point of interaction between the RM and HA clients/services.