[
https://issues.apache.org/jira/browse/YARN-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711766#comment-14711766
]
Subru Krishnan commented on YARN-2884:
--------------------------------------
[~vinodkv], thanks for your feedback. Let me first reiterate what I said in
response to [~jlowe]'s similar observation: I agree that we should not only move
towards a better scheduler discovery model but also completely decouple apps
from platform configs. The reason we didn't go down the path you suggested is
that it introduces a dependency on updating all the AMs (which, unlike the
Timeline service, we don't own) to use the new discovery mechanism. The current
approach, though non-ideal, is agnostic to the AM. To force the AMs to adopt the
new mechanism, we would have to prevent access to the NM's config. If all of you
are OK with that consequence, I can go ahead and make the change.
I think it would be better to open a separate JIRA to address the decoupling of
app and platform config, with an initial sub-task to handle scheduler discovery
through the environment, as you suggested. In that case, we'll update the patch
to remove the changes in ContainerLaunch that override HADOOP_CONF_DIR. AFAIK,
[~jianhe] is OK with the rest of the patch, which he can then commit ASAP. This
will unblock us to use AMRMProxy with at least self-contained apps like
MapReduce and Spark, which make up our major workload.
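To illustrate the environment-based discovery discussed above, here is a minimal
sketch in Java. It is not part of the patch: the variable name
YARN_SCHEDULER_ADDRESS and the class SchedulerDiscovery are hypothetical, and
plain maps stand in for the process environment and the loaded configuration.
The only existing name is the yarn.resourcemanager.scheduler.address property.

```java
import java.util.Map;

public class SchedulerDiscovery {
    // Hypothetical environment variable, chosen only for this illustration.
    public static final String ENV_KEY = "YARN_SCHEDULER_ADDRESS";
    // Existing YARN property that holds the RM scheduler endpoint.
    public static final String CONF_KEY = "yarn.resourcemanager.scheduler.address";

    /**
     * Returns the scheduler address the AM should contact: the value from the
     * environment if it is set and non-blank, otherwise the value from the
     * configuration. Returns null when neither source provides an address.
     */
    public static String resolve(Map<String, String> env, Map<String, String> conf) {
        String fromEnv = env.get(ENV_KEY);
        if (fromEnv != null && !fromEnv.trim().isEmpty()) {
            return fromEnv.trim();
        }
        return conf.get(CONF_KEY);
    }

    public static void main(String[] args) {
        // Placeholder configuration value; a real AM would load its own config.
        Map<String, String> conf = Map.of(CONF_KEY, "rm.example.com:8030");
        System.out.println(resolve(System.getenv(), conf));
    }
}
```

With a lookup of this shape, the NM could point an AM at the local proxy by
setting one variable in the container's launch environment, without overriding
HADOOP_CONF_DIR. AMs that never read the variable would still go straight to the
RM, which is the dependency on updating the AMs noted above.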
> Proxying all AM-RM communications
> ---------------------------------
>
> Key: YARN-2884
> URL: https://issues.apache.org/jira/browse/YARN-2884
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager, resourcemanager
> Reporter: Carlo Curino
> Assignee: Kishore Chaliparambil
> Attachments: YARN-2884-V1.patch, YARN-2884-V2.patch,
> YARN-2884-V3.patch, YARN-2884-V4.patch, YARN-2884-V5.patch,
> YARN-2884-V6.patch, YARN-2884-V7.patch, YARN-2884-V8.patch, YARN-2884-V9.patch
>
>
> We introduce the notion of an RMProxy, running on each node (or once per
> rack). Upon start, the AM is forced (via tokens and configuration) to direct
> all its requests to a new service running on the NM that provides a proxy to
> the central RM.
> This gives us a place to:
> 1) perform distributed scheduling decisions
> 2) throttle misbehaving AMs
> 3) mask the access to a federation of RMs