[ 
https://issues.apache.org/jira/browse/YARN-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842810#comment-13842810
 ] 

Karthik Kambatla commented on YARN-1481:
----------------------------------------

Thanks [~vinodkv] for the refactoring. The code reads better now.

As for the TODO items in the patch:
# HAServiceStatus returned by getServiceStatus() has an isReadyToBecomeActive 
method which is used by the FailoverController to check if an RM is in a state 
to become Active. This corresponds to Active and Standby states, as the RM can 
not be made Active when it is in the INITIALIZING or STOPPING states. If I have 
missed the question entirely, please let me know.
{code}
+    if (isRMActive() || haState == HAServiceProtocol.HAServiceState.STANDBY) { 
// TODO: What ?!
{code}
# Previously, AdminService was altering the Configuration and Bikas had 
suggested we set the RM's conf in case any one decides to make a copy of the 
Configuration in AdminService. Long story short, it can be removed.
{code}
+    if (this.rmContext.isHAEnabled()) {
+      HAUtil.verifyAndSetConfiguration(conf);
+      setConf(conf); // TODO: Why is this needed? Remove.
+    }
{code}
# I am not sure why we need a lock there in the first place. It should be safe 
to remove it here, as transitionToStandby itself is synchronized.
{code}
+        synchronized(rm) { // TODO: Lock which class?
+          if (rmContext.isHAEnabled()) {
             try {
               // Transition to standby and reinit active services
               LOG.info("Transitioning RM to Standby mode");
-              adminService.transitionToStandby(true);
+              rm.transitionToStandby(true);
{code}

Comments on the patch itself: 
# The field haState in ResourceManager can be private. Looks like we forgot to 
make it private along with other changes.
# {{RMContextImpl#setHAServiceState}} should be called everywhere haState is 
set in the RM, AdminService depends on this. Looks like the test failures are 
also because of this. 
# Given RMContext has the haState, it might be better to remove it altogether 
from ResourceManager and use the rmContext version.
# get and set HAState in RMContextImpl should be synchronized on haState before 
setting or getting haState

> ResourceManager and AdminService interact in a convoluted manner after 
> YARN-1318
> --------------------------------------------------------------------------------
>
>                 Key: YARN-1481
>                 URL: https://issues.apache.org/jira/browse/YARN-1481
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Vinod Kumar Vavilapalli
>         Attachments: YARN-1481-20131207.txt
>
>
> This is something I found while reviewing YARN-1318, but didn't halt that 
> patch as many cycles went there already. Some top level issues
>  - Not easy to follow RM's service life cycle
>     -- RM adds only AdminService as its service directly.
>     -- Other services are added to RM when AdminService's init calls 
> RM.activeServices.init()
>  - Overall, AdminService shouldn't encompass all of RM's HA state management. 
> It was originally supposed to be the implementation of just the RPC server.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

Reply via email to