[
https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15482168#comment-15482168
]
Naganarasimha G R commented on YARN-5545:
-----------------------------------------
Thanks [~bibinchundatt] for the patch.
A few points to discuss on the approach:
# Would it be good to have a separate per-queue, per-partition maximum-applications
limit, similar to {{yarn.scheduler.capacity.<queue-path>.maximum-applications}}, so
that logical partitions get the same fine-grained control the default partition
already has? (A hypothetical config sketch follows this list.)
# Would it be better to default
{{yarn.scheduler.capacity.maximum-applications.accessible-node-labels.<label>}}
to the value of {{yarn.scheduler.capacity.maximum-applications}}? That would make
the admin's work much easier. We can decide the same for the previous point if we
adopt it. (See the fallback in the sketch after this list.)
# IIUC, the approach you adopted differs slightly from the one you mentioned in your
[comment|https://issues.apache.org/jira/browse/YARN-5545?focusedCommentId=15453163&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15453163]:
although we have a per-partition max-application limit, we just sum up the max
limits of all partitions under a queue and check the total against
{{ApplicationLimit.getAllMaxApplication()}}. If we never actually validate against
each queue's per-partition max apps, why introduce a new configuration at all?
Also consider the case where the accessibility is * and new partitions are added
{{without refreshing}}: the configured value will then be wrong, since it is
static. (The second sketch below restates this concern in code.)
# We need to take care of documentation, which I think is also missing for
*MaximumAMResourcePercentPerPartition*; maybe that can be handled in a separate
JIRA.
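To make points 1 and 2 concrete, here is a rough sketch of how the properties
could look, in the same key=value shorthand used in the description below. The
per-queue property name is only a proposal for point 1, not anything in the
current patch:
{noformat}
# Existing cluster-wide cap (default 10000)
yarn.scheduler.capacity.maximum-applications=10000
# Per-partition cap from the current patch; per point 2 it could default to the value above
yarn.scheduler.capacity.maximum-applications.accessible-node-labels.labelx=10000
# Hypothetical per-queue, per-partition cap for point 1 (name is a proposal only)
yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.maximum-applications=5000
{noformat}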
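And a minimal, self-contained Java sketch (not the patch's actual code; the class,
method, and map names are made up for illustration) of the point 2 fallback and of
the summed check that point 3 questions:
{noformat}
import java.util.Map;

public class PartitionMaxAppsSketch {
  static final String PREFIX = "yarn.scheduler.capacity.maximum-applications";

  // Point 2: resolve the per-partition limit, falling back to the global
  // maximum-applications value when the partition-specific key is unset.
  static int partitionMaxApps(Map<String, String> conf, String label) {
    int globalMax = Integer.parseInt(conf.getOrDefault(PREFIX, "10000"));
    String v = conf.get(PREFIX + ".accessible-node-labels." + label);
    return v == null ? globalMax : Integer.parseInt(v);
  }

  // Point 3, restated: if submission only ever checks the summed total across
  // partitions, the individual per-partition limits are never enforced.
  static boolean canSubmit(Map<String, Integer> perPartitionMax, int currentApps) {
    int summed = perPartitionMax.values().stream().mapToInt(Integer::intValue).sum();
    return currentApps < summed;
  }
}
{noformat}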
> App submit failure on queue with label when default queue partition capacity
> is zero
> ------------------------------------------------------------------------------------
>
> Key: YARN-5545
> URL: https://issues.apache.org/jira/browse/YARN-5545
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch,
> YARN-5545.0003.patch, capacity-scheduler.xml
>
>
> Configure capacity scheduler
> yarn.scheduler.capacity.root.default.capacity=0
> yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50
> yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50
> Submit application as below
> ./yarn jar
> ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar
> sleep -Dmapreduce.job.node-label-expression=labelx
> -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 10000000 -rt 1
> {noformat}
> 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging
> area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed
> to submit application_1471670113386_0001 to YARN :
> org.apache.hadoop.security.AccessControlException: Queue root.default already
> has 0 applications, cannot accept submission of application:
> application_1471670113386_0001
> at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
> at
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
> at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
> at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
> at
> org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136)
> at
> org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit
> application_1471670113386_0001 to YARN :
> org.apache.hadoop.security.AccessControlException: Queue root.default already
> has 0 applications, cannot accept submission of application:
> application_1471670113386_0001
> at
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:286)
> at
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:296)
> at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301)
> ... 25 more
> {noformat}