[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493271#comment-15493271 ]
Jason Lowe commented on YARN-5545:
----------------------------------
bq. This could be configured to set max-apps per queue level in cluster level (queue won’t override this).
A queue-level max-apps setting should always override the system-level setting.
If a user explicitly sets max-apps for a particular queue then we cannot ignore
that. We already have setups today where max-apps is tuned at the queue level
for some queues.
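For illustration, such a setup might look like the following (the queue name is just an example); the explicit per-queue key takes precedence over the limit derived from the system-wide cap:
{noformat}
yarn.scheduler.capacity.maximum-applications=10000
yarn.scheduler.capacity.root.queue1.maximum-applications=500
{noformat}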
Today, if a user sets a queue-level max-apps limit, it overrides any
system-level limit. So even today users can configure RMs that accept more
than the system-level app limit, by explicitly overriding the derived queue
limits with larger values. Therefore I'm tempted to have the global queue
config completely override the old system-level max-apps config, because
doing so is akin to setting max-apps explicitly on every queue. That means we
operate in one of two modes:
- If global queue max-apps is not set, we do what we do today and derive each
queue's max-apps from its relative capacity. Queues that override max-apps at
their level continue to behave as they do today and get the override setting.
- If global queue max-apps is set, yarn.scheduler.capacity.maximum-applications
is completely ignored. Queues that override max-apps at their level still get
their override setting; queues that do not override get the global queue
setting as their max-apps limit.
This preserves existing behavior when the global queue max-apps is not set,
and is likely the least surprising behavior when the new setting is used,
especially if we document for both the old system max-apps config and the new
global queue max-apps config that the latter always overrides the former when
set.
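To make the precedence concrete, here is a minimal sketch of the resolution order, assuming a single integer knob for the proposed global queue max-apps (the class and field names are illustrative, not committed API):
{noformat}
// Sketch only: resolves a queue's max-apps limit under the proposed scheme.
// A negative value means "not configured" for each optional setting.
public final class MaxAppsResolver {
  private final int systemMaxApps;      // yarn.scheduler.capacity.maximum-applications
  private final int globalQueueMaxApps; // proposed global queue max-apps, -1 if unset

  public MaxAppsResolver(int systemMaxApps, int globalQueueMaxApps) {
    this.systemMaxApps = systemMaxApps;
    this.globalQueueMaxApps = globalQueueMaxApps;
  }

  // queueMaxApps: explicit per-queue override, -1 if unset
  // absoluteCapacity: the queue's absolute capacity in [0, 1]
  public int resolve(int queueMaxApps, float absoluteCapacity) {
    if (queueMaxApps >= 0) {
      return queueMaxApps;               // explicit per-queue setting always wins
    }
    if (globalQueueMaxApps >= 0) {
      return globalQueueMaxApps;         // mode 2: system-level limit ignored
    }
    return (int) (systemMaxApps * absoluteCapacity); // mode 1: today's derivation
  }
}
{noformat}
Note that new MaxAppsResolver(10000, -1).resolve(-1, 0f) returns 0, which is exactly the rejection reported in this JIRA; with a global queue max-apps configured, the same queue would get that value instead.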
> App submit failure on queue with label when default queue partition capacity is zero
> -------------------------------------------------------------------------------------
>
> Key: YARN-5545
> URL: https://issues.apache.org/jira/browse/YARN-5545
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, YARN-5545.0003.patch, capacity-scheduler.xml
>
>
> Configure the capacity scheduler as follows:
> yarn.scheduler.capacity.root.default.capacity=0
> yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50
> yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50
> Then submit an application as below:
> ./yarn jar ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar sleep -Dmapreduce.job.node-label-expression=labelx -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 10000000 -rt 1
> {noformat}
> 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1471670113386_0001 to YARN : org.apache.hadoop.security.AccessControlException: Queue root.default already has 0 applications, cannot accept submission of application: application_1471670113386_0001
>         at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
>         at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255)
>         at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>         at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>         at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>         at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>         at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>         at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136)
>         at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1471670113386_0001 to YARN : org.apache.hadoop.security.AccessControlException: Queue root.default already has 0 applications, cannot accept submission of application: application_1471670113386_0001
>         at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:286)
>         at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:296)
>         at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301)
>         ... 25 more
> {noformat}
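> (The derived limit here is presumably maximum-applications, default 10000, times the queue's default-partition absolute capacity of 0, i.e. 10000 * 0 = 0 permitted applications, hence the "already has 0 applications" rejection.)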