[
https://issues.apache.org/jira/browse/YARN-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323728#comment-16323728
]
stefanlee edited comment on YARN-7672 at 1/12/18 9:18 AM:
----------------------------------------------------------
[~zsl2007] Thanks for this JIRA. I merged this patch into my Hadoop version
and ran into a problem during testing.
{code:java}
1. RM1 is active, RM2 is standby.
2. I run SLSRunnerForRealRM, and my jobs run in my cluster with the correct
user name and queue name.
Then:
1. RM1 is standby, RM2 is active.
2. I run SLSRunnerForRealRM, and my jobs fail over to RM2. They then run in
my cluster as the user who launched SLSRunnerForRealRM; that is, they all
run in one queue.
{code}
I reviewed the Hadoop source and found that the problem occurs in
*ConfiguredRMFailoverProxyProvider.getProxyInternal->RMProxy.getProxy*:
{code:java}
static <T> T getProxy(final Configuration conf,
    final Class<T> protocol, final InetSocketAddress rmAddress)
    throws IOException {
  // The proxy is created as whichever user is current at call time.
  return UserGroupInformation.getCurrentUser().doAs(
    new PrivilegedAction<T>() {
      @Override
      public T run() {
        return (T) YarnRPC.create(conf).getProxy(protocol, rmAddress, conf);
      }
    });
}
{code}
Here it calls *getCurrentUser()*, so after a failover the jobs run as the
user who launched SLSRunnerForRealRM instead of their own user. We should
come up with a solution for this.
With only one RM, it runs fine. :D
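To make the suspected mechanism concrete, here is a toy model in plain Java. It does not use Hadoop at all: FailoverUserDemo, FailoverProvider, createProxy and the "sls-runner" and "alice" user names are hypothetical stand-ins, and it assumes that the proxy for the second RM is created outside the simulated user's doAs scope, which is only my reading of the behaviour described above. It also sketches one possible direction, pinning the user captured when the provider is created, without claiming that this is how the patch should be fixed.

```java
import java.util.function.Supplier;

/** Toy model only: every name here is a hypothetical stand-in, not a Hadoop API. */
final class FailoverUserDemo {

    // Stand-in for the per-thread security context; defaults to the user
    // who launched the simulator.
    private static final ThreadLocal<String> CURRENT_USER =
        ThreadLocal.withInitial(() -> "sls-runner");

    static String getCurrentUser() {
        return CURRENT_USER.get();
    }

    // Runs the action as the given user, then restores the previous user.
    static <T> T doAs(String user, Supplier<T> action) {
        String previous = CURRENT_USER.get();
        CURRENT_USER.set(user);
        try {
            return action.get();
        } finally {
            CURRENT_USER.set(previous);
        }
    }

    // Same shape as the excerpt above: the proxy is bound to whoever is
    // current at the moment it is created.
    static String createProxy(String rmAddress) {
        return getCurrentUser() + "@" + rmAddress;
    }

    /** Stand-in for a failover provider that creates one proxy per RM on demand. */
    static final class FailoverProvider {
        private final String pinnedUser; // null means "use the current user"

        FailoverProvider(boolean pinUser) {
            this.pinnedUser = pinUser ? getCurrentUser() : null;
        }

        String getProxy(String rmAddress) {
            if (pinnedUser == null) {
                return createProxy(rmAddress);
            }
            return doAs(pinnedUser, () -> createProxy(rmAddress));
        }
    }

    /** Returns the proxy identities before and after a simulated failover. */
    static String[] simulate(boolean pinUser) {
        final String[] result = new String[2];
        // The simulated job builds its client and first proxy as "alice".
        FailoverProvider provider = doAs("alice", () -> {
            FailoverProvider p = new FailoverProvider(pinUser);
            result[0] = p.getProxy("rm1");
            return p;
        });
        // Failover happens later, outside the doAs scope.
        result[1] = provider.getProxy("rm2");
        return result;
    }

    public static void main(String[] args) {
        String[] unpinned = simulate(false);
        String[] pinned = simulate(true);
        System.out.println("unpinned: " + unpinned[0] + " -> " + unpinned[1]);
        System.out.println("pinned:   " + pinned[0] + " -> " + pinned[1]);
    }
}
```

In this model the unpinned provider hands back a proxy owned by the launching user after failover, while the pinned one keeps the simulated user. Whether the real provider can capture the user this way would need to be checked against the Hadoop source.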
> hadoop-sls can not simulate huge scale of YARN
> ----------------------------------------------
>
> Key: YARN-7672
> URL: https://issues.apache.org/jira/browse/YARN-7672
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: zhangshilong
> Assignee: zhangshilong
> Attachments: YARN-7672.patch
>
>
> Our YARN cluster has scaled to nearly 10 thousand nodes, and we need to
> pressure-test the scheduler.
> Using SLS, we start 2000+ threads to simulate NMs and AMs, but the CPU load
> climbs very high, to 100+. I think that will affect the performance
> evaluation of the scheduler.
> So I decided to separate the scheduler from the simulator:
> I start a real RM, then SLS registers nodes with the RM and submits apps to
> the RM using the RM RPC.