Subru Krishnan commented on YARN-2884:

[~jianhe] asked for an offline clarification - how does the AM know to connect to 
the AMRMProxy and not the RM?

If AMRMProxy is enabled, we need a *HADOOP_CLIENT_CONF_DIR* on every machine in 
the cluster containing a yarn-site with _resourcemanager.scheduler.address_ 
pointing to the local AMRMProxy service. In the NM's *ContainerLaunch* we swap 
the HADOOP_CONF_DIR in the AM container env to point to HADOOP_CLIENT_CONF_DIR 
(only if AMRMProxy is enabled and HADOOP_CLIENT_CONF_DIR is not null, to ensure 
full backward compatibility). We tested with MapReduce, Spark & REEF and were 
able to run all of them successfully in Federated YARN mode. 
Additionally, this enhances YARN security: currently server configs are leaked 
to all the AMs, but with this change we can control every AM's view.
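
To illustrate the swap described above, here is a minimal sketch (class and 
method names are hypothetical, not the actual patch code): the NM only rewrites 
the AM container's HADOOP_CONF_DIR when both conditions hold, so existing 
deployments without HADOOP_CLIENT_CONF_DIR behave exactly as before.

```java
import java.util.HashMap;
import java.util.Map;

public class AmEnvSwapSketch {
  // Hypothetical illustration of the ContainerLaunch-time swap: if AMRMProxy
  // is enabled and the node has a HADOOP_CLIENT_CONF_DIR, point the AM
  // container's HADOOP_CONF_DIR at it, so the AM reads a yarn-site whose
  // scheduler address is the local AMRMProxy instead of the central RM.
  static void maybeSwapConfDir(Map<String, String> containerEnv,
                               boolean amrmProxyEnabled,
                               String clientConfDir) {
    if (amrmProxyEnabled && clientConfDir != null) {
      containerEnv.put("HADOOP_CONF_DIR", clientConfDir);
    }
    // Otherwise leave HADOOP_CONF_DIR untouched for full backward compatibility.
  }

  public static void main(String[] args) {
    Map<String, String> env = new HashMap<>();
    env.put("HADOOP_CONF_DIR", "/etc/hadoop/conf");
    maybeSwapConfDir(env, true, "/etc/hadoop/client-conf");
    System.out.println(env.get("HADOOP_CONF_DIR"));
  }
}
```

The AM itself needs no change: it resolves the scheduler address from whatever 
configuration directory HADOOP_CONF_DIR points at, which is what makes the 
redirect transparent to MapReduce, Spark, and REEF alike.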

> Proxying all AM-RM communications
> ---------------------------------
>                 Key: YARN-2884
>                 URL: https://issues.apache.org/jira/browse/YARN-2884
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Carlo Curino
>            Assignee: Kishore Chaliparambil
>         Attachments: YARN-2884-V1.patch, YARN-2884-V2.patch, 
> YARN-2884-V3.patch, YARN-2884-V4.patch, YARN-2884-V5.patch, 
> YARN-2884-V6.patch, YARN-2884-V7.patch, YARN-2884-V8.patch, YARN-2884-V9.patch
> We introduce the notion of an RMProxy, running on each node (or once per 
> rack). Upon start, the AM is forced (via tokens and configuration) to direct 
> all its requests to a new service running on the NM that provides a proxy to 
> the central RM. 
> This gives us a place to:
> 1) perform distributed scheduling decisions
> 2) throttle misbehaving AMs
> 3) mask the access to a federation of RMs
