[
https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553917#comment-16553917
]
Antal Bálint Steinbach edited comment on YARN-8566 at 7/25/18 6:50 AM:
-----------------------------------------------------------------------
Hi [~snemeth]
+1 LGTM (Non-binding) Thanks for the fix.
was (Author: bsteinbach):
Hi [~snemeth]
+1 Thanks for the fix.
> Add diagnostic message for unschedulable containers
> ---------------------------------------------------
>
> Key: YARN-8566
> URL: https://issues.apache.org/jira/browse/YARN-8566
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Major
> Attachments: YARN-8566.001.patch, YARN-8566.002.patch,
> YARN-8566.003.patch, YARN-8566.004.patch
>
>
> If a queue is configured with maxResources set to 0 for a resource, and an
> application that requests that resource is submitted to that queue, the
> application will remain pending until it is removed or moved to a different
> queue. This behavior can be reproduced without extended resources, but it’s
> unlikely that a user will create a queue that allows 0 memory or CPU. As the
> number of resource types in the system increases, this scenario will become
> more common, and it will become harder to recognize these cases. Therefore,
> the scheduler should indicate in the application's diagnostic string when it
> was not scheduled because of a 0 maxResources setting.
> Example configuration (fair-scheduler.xml):
> {code:xml}
> <allocations>
>   <queueMaxAppsDefault>100000</queueMaxAppsDefault>
>   <queue name="sample_queue">
>     <minResources>10000 mb,2vcores</minResources>
>     <maxResources>90000 mb,4vcores, 0gpu</maxResources>
>     <maxRunningApps>50</maxRunningApps>
>     <maxAMShare>-1.0f</maxAMShare>
>     <weight>2.0</weight>
>     <schedulingPolicy>fair</schedulingPolicy>
>   </queue>
> </allocations>
> {code}
> Command:
> {code:bash}
> yarn jar "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi \
>   -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000
> {code}
> The job hangs and the application diagnostic info is empty.
> Given that an exception is thrown before any mapper/reducer container is
> created, the diagnostic message of the AM should be updated.
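
For illustration only, the check the description asks for could be sketched roughly as below. This is not the code from the attached patches: the class ZeroMaxResourceCheck, the method diagnose, the message wording, and the use of plain maps in place of YARN's resource types are all assumptions made for this example. Resources that the queue does not list are treated as unlimited here, which is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of the YARN-8566 patches.
final class ZeroMaxResourceCheck {

    private ZeroMaxResourceCheck() {
    }

    /**
     * Returns one diagnostic line for each requested resource whose
     * queue-level maximum is 0. Resources missing from queueMax are
     * treated as unlimited (an assumption of this sketch).
     */
    static List<String> diagnose(String queueName,
                                 Map<String, Long> queueMax,
                                 Map<String, Long> request) {
        List<String> messages = new ArrayList<>();
        // Copy into a TreeMap so messages come out sorted by resource name.
        for (Map.Entry<String, Long> entry : new TreeMap<>(request).entrySet()) {
            long requested = entry.getValue();
            Long max = queueMax.get(entry.getKey());
            if (requested > 0 && max != null && max == 0L) {
                messages.add("Queue " + queueName + " has maxResources of 0 for resource '"
                        + entry.getKey() + "', but the application requests " + requested
                        + "; it cannot be scheduled in this queue.");
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        // Values mirror the sample_queue configuration and the pi job above.
        Map<String, Long> queueMax = new TreeMap<>();
        queueMax.put("memory-mb", 90000L);
        queueMax.put("vcores", 4L);
        queueMax.put("gpu", 0L);

        Map<String, Long> request = new TreeMap<>();
        request.put("memory-mb", 1024L); // hypothetical container size
        request.put("vcores", 1L);
        request.put("gpu", 1L);

        diagnose("sample_queue", queueMax, request).forEach(System.out::println);
    }
}
```

With the sample values, only the gpu request produces a message, since it is the only requested resource with a queue maximum of 0. A real fix would need to attach such a message to the application's diagnostics inside the scheduler rather than print it.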
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)