[ 
https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760582#comment-13760582
 ] 

Karthik Kambatla commented on YARN-1027:
----------------------------------------

Thanks for the detailed review, [~bikas]. 

bq. What are the pros of making haState a member of ResourceManager instead of 
HAServiceProtocol? A pro of the latter is that it keeps all HA stuff in one 
place.
In the future, when individual external-facing services need to behave 
differently based on the HAState, having it in the RM might be useful. However, 
I think we should move it to RMHAProtocolService for now, and move it to the RM 
or RMContext later if the need arises.

bq. Why is there a lock used in ResourceManager.startActive() etc.? Why are 
these methods protected? If for testing, then let's add an @VisibleForTesting 
annotation.
The lock is to protect against concurrent invocations of transitionToActive() 
and transitionToStandby() due to, say, user input. The methods are protected 
because they are accessed from outside the RM - in this case, from 
RMHAProtocolService.
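
To illustrate the locking described above, here is a minimal sketch. The class 
HaTransitionSketch, its fields, and the counter are hypothetical and are not 
the code in the patch; the sketch only shows how synchronized transition 
methods keep concurrent requests from interleaving.

```java
// Hypothetical sketch; not the actual ResourceManager code from the patch.
class HaTransitionSketch {
  enum State { STANDBY, ACTIVE }

  private State state = State.STANDBY;
  private int activeStarts = 0;  // how many times active services were started

  // synchronized: only one transition runs at a time on this object.
  synchronized void transitionToActive() {
    if (state == State.ACTIVE) {
      return;  // already active, so a repeated request is a no-op
    }
    activeStarts++;  // stand-in for creating and starting the active services
    state = State.ACTIVE;
  }

  synchronized void transitionToStandby() {
    if (state == State.STANDBY) {
      return;  // already standby, nothing to stop
    }
    state = State.STANDBY;  // stand-in for stopping the active services
  }

  synchronized State getState() {
    return state;
  }

  synchronized int getActiveStarts() {
    return activeStarts;
  }
}
```

With this shape, two back-to-back transitionToActive() calls start the active 
services only once, since the second call sees the updated state under the 
same lock.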

bq. Is there a way to confirm that the active service objects are all being 
GC'd?
Not sure of a deterministic test. How about using the Runtime memory methods 
(e.g. totalMemory() and freeMemory()) to measure memory usage before and after 
transitioning to Active and subsequently to Standby? 
I can also run jmap against a real RM on a pseudo-distributed cluster and see 
whether the objects are being cleaned up. 
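
A rough sketch of the Runtime-based measurement, for illustration only. The 
class HeapDeltaSketch and the two Runnable parameters are hypothetical 
stand-ins for the RM transitions. Since Runtime.gc() is only a hint to the 
JVM, the resulting number is indicative rather than deterministic.

```java
// Hypothetical helper; the Runnables stand in for the RM transitions.
class HeapDeltaSketch {
  // Approximate used heap in bytes, after asking the JVM to collect garbage.
  static long usedHeapBytes() {
    Runtime rt = Runtime.getRuntime();
    rt.gc();  // a hint only; the JVM is free to ignore it
    return rt.totalMemory() - rt.freeMemory();
  }

  // Change in used heap across one active -> standby cycle.
  static long heapDeltaAcross(Runnable toActive, Runnable toStandby) {
    long before = usedHeapBytes();
    toActive.run();
    toStandby.run();
    return usedHeapBytes() - before;
  }
}
```

A delta that keeps growing over repeated cycles would suggest that the active 
service objects are still being referenced somewhere.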

bq. Didn't quite get this comment. Is this to do with the change being 
requested by user/admin/ZKFC?
If automatic failover is enabled and a user issues a transition command, it 
should take effect only when it is "forced".
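
The rule above could be sketched as follows. TransitionGuardSketch and its 
method are hypothetical names chosen for illustration, not part of 
HAServiceProtocol or the patch.

```java
// Hypothetical sketch of the "forced" rule for user-issued transitions.
class TransitionGuardSketch {
  private final boolean autoFailoverEnabled;

  TransitionGuardSketch(boolean autoFailoverEnabled) {
    this.autoFailoverEnabled = autoFailoverEnabled;
  }

  // A user-issued transition is honored when automatic failover is off,
  // or when the user explicitly forces it.
  boolean allowsUserTransition(boolean forced) {
    return !autoFailoverEnabled || forced;
  }
}
```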

Agree with the remaining comments. Will fix them in the next version.
                
> Implement RMHAServiceProtocol
> -----------------------------
>
>                 Key: YARN-1027
>                 URL: https://issues.apache.org/jira/browse/YARN-1027
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Karthik Kambatla
>         Attachments: test-yarn-1027.patch, yarn-1027-1.patch, 
> yarn-1027-2.patch, yarn-1027-3.patch, yarn-1027-including-yarn-1098-3.patch, 
> yarn-1027-in-rm-poc.patch
>
>
> Implement existing HAServiceProtocol from Hadoop common. This protocol is the 
> single point of interaction between the RM and HA clients/services.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
