[
https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495481#comment-15495481
]
Naganarasimha G R commented on YARN-5545:
-----------------------------------------
Thanks [~sunilg], [~wangda] & [~jlowe] for taking the discussion forward.
I still have a few queries:
# {{GlobalMaximumApplicationsPerQueue}} doesn't have any default set, right? If it
is set, then there is no need for {{maxSystemApps *
queueCapacities.getAbsoluteCapacity()}}, as that calculation will never be reached.
# IMO the approach that Sunil captured in his earlier
[comment|https://issues.apache.org/jira/browse/YARN-5545?focusedCommentId=15494147&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15494147]
does not solve the base problem completely. The problem started with
{{maxSystemApps * queueCapacities.getAbsoluteCapacity()}}: which partition's
absolute capacity should be considered when a given queue does not override
max applications and the queue's capacity in the default partition is zero?
With your approach, the only way to avoid this is to set
{{GlobalMaximumApplicationsPerQueue}}, which implies that this value will be
used for all the queues and the earlier approach of {{maxSystemApps *
queueCapacities.getAbsoluteCapacity()}} will not be considered.
# I feel that {{enforce strict checking}} should have been an implicit requirement,
on the assumption that the admin would not have configured queue max apps to
exceed system max apps. We need not validate in the configuration that every
queue's max apps is no greater than system max apps; we only need to validate,
while submitting the app, that first the system-level max apps is not violated
and then the queue-level max apps is not violated.
Thoughts?
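To make points 2 and 3 concrete, here is a minimal sketch of the limit resolution and the two-step submission check. It is not the actual CapacityScheduler code: the class {{MaxAppsSketch}}, its method names, the use of a negative value to mean "not configured", and the precedence order (per-queue override, then {{GlobalMaximumApplicationsPerQueue}}, then {{maxSystemApps * absoluteCapacity}}) are all assumptions made for illustration.

```java
// Hypothetical sketch; names are illustrative and not taken from the YARN code base.
final class MaxAppsSketch {

    /** Outcome of an admission check at submission time. */
    enum Decision { ACCEPT, REJECT_SYSTEM_LIMIT, REJECT_QUEUE_LIMIT }

    /**
     * Resolves the effective per-queue application limit.
     * Assumption: a negative value means "not configured".
     */
    static int effectiveQueueMaxApps(int explicitQueueMax, int globalPerQueueMax,
                                     int maxSystemApps, float absoluteCapacity) {
        if (explicitQueueMax >= 0) {
            return explicitQueueMax;        // per-queue override wins
        }
        if (globalPerQueueMax >= 0) {
            return globalPerQueueMax;       // global per-queue value, if set
        }
        // Capacity-derived fallback: yields 0 when the absolute capacity is 0.
        return (int) (maxSystemApps * absoluteCapacity);
    }

    /** Checks the system-level limit first, then the queue-level limit. */
    static Decision checkSubmission(int systemApps, int maxSystemApps,
                                    int queueApps, int queueMaxApps) {
        if (systemApps >= maxSystemApps) {
            return Decision.REJECT_SYSTEM_LIMIT;
        }
        if (queueApps >= queueMaxApps) {
            return Decision.REJECT_QUEUE_LIMIT;
        }
        return Decision.ACCEPT;
    }
}
```

With neither override set and a default-partition capacity of zero, the fallback resolves to a limit of 0, so even the first submission to an empty queue is rejected at the queue level, which matches the "already has 0 applications" failure in the report. Setting the global per-queue value sidesteps the fallback, but then it applies to every queue that has no override of its own.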
> App submit failure on queue with label when default queue partition capacity
> is zero
> ------------------------------------------------------------------------------------
>
> Key: YARN-5545
> URL: https://issues.apache.org/jira/browse/YARN-5545
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch,
> YARN-5545.0003.patch, capacity-scheduler.xml
>
>
> Configure capacity scheduler
> yarn.scheduler.capacity.root.default.capacity=0
> yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50
> yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50
> Submit application as below
> ./yarn jar
> ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar
> sleep -Dmapreduce.job.node-label-expression=labelx
> -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 10000000 -rt 1
> {noformat}
> 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging
> area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed
> to submit application_1471670113386_0001 to YARN :
> org.apache.hadoop.security.AccessControlException: Queue root.default already
> has 0 applications, cannot accept submission of application:
> application_1471670113386_0001
> at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
> at
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
> at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
> at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
> at
> org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136)
> at
> org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit
> application_1471670113386_0001 to YARN :
> org.apache.hadoop.security.AccessControlException: Queue root.default already
> has 0 applications, cannot accept submission of application:
> application_1471670113386_0001
> at
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:286)
> at
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:296)
> at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301)
> ... 25 more
> {noformat}