[jira] [Commented] (YARN-2884) Proxying all AM-RM communications

Jian He (JIRA) Wed, 26 Aug 2015 23:25:12 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716153#comment-14716153
 ]


Jian He commented on YARN-2884:
-------------------------------

Looks good to me overall, but I think there are still some problems with the 
AMRMProxyToken implementation. Basically, long-running services may not work 
with the AMRMProxy.

1) The code below in DefaultRequestInterceptor should create and return a new 
AMRMProxyToken in the final returned allocate response when needed. Otherwise, 
the AM will fail to talk to the AMRMProxy after the key is rolled over in the 
AMRMProxyTokenSecretManager. 
{code}
  @Override
  public AllocateResponse allocate(AllocateRequest request)
      throws YarnException, IOException {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Forwarding allocate request to the real YARN RM");
    }
    AllocateResponse allocateResponse = rmClient.allocate(request);
    if (allocateResponse.getAMRMToken() != null) {
      updateAMRMToken(allocateResponse.getAMRMToken());
    }
    return allocateResponse; // <==== no new AMRMProxyToken is set here
  }
{code}
The code below in ApplicationMasterService#allocate shows how that is done.
{code}
      if (nextMasterKey != null
          && nextMasterKey.getMasterKey().getKeyId() != amrmTokenIdentifier
            .getKeyId()) {
        RMAppAttemptImpl appAttemptImpl = (RMAppAttemptImpl)appAttempt;
        Token<AMRMTokenIdentifier> amrmToken = appAttempt.getAMRMToken();
        if (nextMasterKey.getMasterKey().getKeyId() !=
            appAttemptImpl.getAMRMTokenKeyId()) {
          LOG.info("The AMRMToken has been rolled-over. Send new AMRMToken back"
              + " to application: " + applicationId);
          amrmToken = rmContext.getAMRMTokenSecretManager()
              .createAndGetAMRMToken(appAttemptId);
          appAttemptImpl.setAMRMToken(amrmToken);
        }
        allocateResponse.setAMRMToken(org.apache.hadoop.yarn.api.records.Token
          .newInstance(amrmToken.getIdentifier(), amrmToken.getKind()
            .toString(), amrmToken.getPassword(), amrmToken.getService()
            .toString()));
      }
{code}
2) Some methods inside the AMRMProxyTokenSecretManager are not used at all. 
Can we remove them?

3) I think we need at least one end-to-end test for this. We can use 
MiniYARNCluster to simulate the whole thing: the AM talks to the AMRMProxy, 
which talks to the RM to register/allocate/finish. In the test, we should also 
reduce RM_AMRM_TOKEN_MASTER_KEY_ROLLING_INTERVAL_SECS so that we can simulate 
the token renewal behavior. I'm OK with having a separate JIRA to track the 
end-to-end test, as this is a bit of work.
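
To illustrate point 1, here is a rough, self-contained sketch of the rollover check, modeled on the ApplicationMasterService#allocate snippet above. It does not use the real Hadoop classes: ProxyTokenRolloverSketch, Token, AllocateResponse, SecretManager and all of their methods are hypothetical stand-ins, and the key ids are plain integers. It only shows the shape of the logic (issue a new token once a next master key exists and the AM is still using an older one), not how the patch should actually implement it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the proxy-side token rollover check.
// None of these types are the real Hadoop classes.
final class ProxyTokenRolloverSketch {

  /** Stand-in for a token; only the master key id matters for this sketch. */
  static final class Token {
    final String attemptId;
    final int keyId;

    Token(String attemptId, int keyId) {
      this.attemptId = attemptId;
      this.keyId = keyId;
    }
  }

  /** Stand-in for the allocate response; null means "no new token for the AM". */
  static final class AllocateResponse {
    Token amrmToken;
  }

  /** Stand-in for a secret manager that holds a current and a next master key. */
  static final class SecretManager {
    private int currentKeyId = 1;
    private Integer nextKeyId = null;

    /** Simulates the start of a key rollover: a next key becomes available. */
    void rollMasterKey() {
      nextKeyId = currentKeyId + 1;
    }

    Integer getNextKeyId() {
      return nextKeyId;
    }

    /** Creates a token under the next key if one exists, else the current key. */
    Token createToken(String attemptId) {
      return new Token(attemptId, nextKeyId != null ? nextKeyId : currentKeyId);
    }
  }

  private final SecretManager secretManager;
  // Last token issued per application attempt, so one rollover issues one token.
  private final Map<String, Token> latestTokenPerAttempt = new HashMap<>();

  ProxyTokenRolloverSketch(SecretManager secretManager) {
    this.secretManager = secretManager;
  }

  /**
   * Attaches a new token to the response when the key has been rolled over
   * and the request was authenticated with a token from an older key.
   */
  AllocateResponse allocate(String attemptId, int requestTokenKeyId,
      AllocateResponse response) {
    Integer nextKeyId = secretManager.getNextKeyId();
    if (nextKeyId != null && nextKeyId.intValue() != requestTokenKeyId) {
      Token latest = latestTokenPerAttempt.get(attemptId);
      if (latest == null || latest.keyId != nextKeyId.intValue()) {
        // First allocate call after this rollover: create a token once.
        latest = secretManager.createToken(attemptId);
        latestTokenPerAttempt.put(attemptId, latest);
      }
      response.amrmToken = latest;
    }
    return response;
  }
}
```

In this sketch the same token keeps being returned on every allocate call until the AM starts authenticating with the new key id, which mirrors what the RM-side code above does with appAttemptImpl.getAMRMTokenKeyId().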


> Proxying all AM-RM communications
> ---------------------------------
>
>                 Key: YARN-2884
>                 URL: https://issues.apache.org/jira/browse/YARN-2884
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Carlo Curino
>            Assignee: Kishore Chaliparambil
>         Attachments: YARN-2884-V1.patch, YARN-2884-V10.patch, 
> YARN-2884-V11.patch, YARN-2884-V2.patch, YARN-2884-V3.patch, 
> YARN-2884-V4.patch, YARN-2884-V5.patch, YARN-2884-V6.patch, 
> YARN-2884-V7.patch, YARN-2884-V8.patch, YARN-2884-V9.patch
>
>
> We introduce the notion of an RMProxy, running on each node (or once per 
> rack). Upon start, the AM is forced (via tokens and configuration) to direct 
> all its requests to a new service running on the NM that provides a proxy to 
> the central RM. 
> This gives us a place to:
> 1) perform distributed scheduling decisions
> 2) throttle misbehaving AMs
> 3) mask the access to a federation of RMs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
