[ https://issues.apache.org/jira/browse/YARN-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639740#comment-14639740 ]
Subru Krishnan commented on YARN-2884:
--------------------------------------

To give more context on the approach we took, please find below a summary of the offline discussions we had with [~kishorch], [~jianhe], [~leftnoteasy], [~zjshen], [~kkaranasos], [~chris.douglas].

One of the main drivers for the discussion was whether the AMRMProxy service needs to sit man-in-the-middle between the RM and the NM for the SASL handshake to succeed. On investigation we realized that we do need to swap the AMRMToken, since the AM registers with the AMRMProxy service instead of the RM and we must validate the AMRMToken. To achieve this we need either the RM's secret key or to generate and swap the AMRMToken in the AMRMProxy; we went with the latter approach for obvious reasons.

We considered a few options for plugging the AMRMProxy into the NM:
* Adding AMRMProxy as an auxiliary service: this looked like the least invasive method, but the AMRMProxy requires access to NM state (the SecretManager for generating local AMRMTokens, the StateStore for persisting/recovering across NM restarts without killing the AM, etc.). We want to isolate aux services from the NM and hence do not want to give them access to internal state.
* Making the NM ContainerManager pluggable and implementing the AMRMProxy as a custom ContainerManager that extends the default ContainerManagerImpl: this would give us all the leverage needed to implement the AMRMProxy, i.e. access to the NM context, the ability to man-in-the-middle container lifecycle events, etc. But it would increase the complexity of the already heavy ContainerManager, as we plan to support multiple handlers such as Federation (YARN-2915) and distributed scheduling (YARN-2877) in the AMRMProxy. Additionally, we want to retain the flexibility to deploy the AMRMProxy as an independent daemon in the future.

So the final approach we decided on was to plug in the AMRMProxy as an independent first-class service in the NM, with a flag to enable/disable it.
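To illustrate the token-swap idea described above, here is a minimal, self-contained Java sketch. It does not use the actual YARN classes from the patch; the class and method names (AMRMProxyTokenSwap, swapForLocalToken, swapBackForForwarding) and the string-based token representation are illustrative assumptions, standing in for the real AMRMTokenIdentifier/SecretManager machinery.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the AMRMToken swap performed by the AMRMProxy.
// Names and token representation are illustrative, not the YARN-2884 patch.
public class AMRMProxyTokenSwap {
  // RM-issued tokens set aside per application, keyed by application id.
  private final Map<String, String> rmTokens = new HashMap<>();
  private int localCounter = 0;

  // AM container pre-start: replace the RM-issued AMRMToken in the AM's
  // credentials with one issued locally, so the AM is forced to talk to
  // the proxy instead of the RM.
  public String swapForLocalToken(String appId, String rmToken) {
    rmTokens.put(appId, rmToken);
    return "LOCAL-" + appId + "-" + (++localCounter);
  }

  // On registerApplicationMaster: validate the locally issued token, then
  // restore the original RM token before forwarding the request to the RM.
  public String swapBackForForwarding(String appId, String localToken) {
    if (localToken == null || !localToken.startsWith("LOCAL-" + appId + "-")) {
      throw new IllegalArgumentException("Invalid local AMRMToken for " + appId);
    }
    return rmTokens.get(appId);
  }
}
```

The key property the sketch captures is that neither side needs the other's secret key: the proxy validates only tokens it issued itself, and the RM continues to see its own original token on forwarded requests.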
We added an AM container pre-start hook in the ContainerManager where we swap the AMRMToken issued by the RM with one issued locally by the AMRMProxy. On receiving the register application call, the AMRMProxy swaps back the original token issued by the RM and forwards the request.

> Proxying all AM-RM communications
> ---------------------------------
>
>                 Key: YARN-2884
>                 URL: https://issues.apache.org/jira/browse/YARN-2884
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Carlo Curino
>            Assignee: Kishore Chaliparambil
>        Attachments: YARN-2884-V1.patch
>
>
> We introduce the notion of an RMProxy, running on each node (or once per
> rack). Upon start the AM is forced (via tokens and configuration) to direct
> all its requests to a new service running on the NM that provides a proxy to
> the central RM.
> This gives us a place to:
> 1) perform distributed scheduling decisions
> 2) throttle misbehaving AMs
> 3) mask the access to a federation of RMs

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)