[
https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15784349#comment-15784349
]
Ying Zhang edited comment on YARN-6031 at 12/29/16 3:03 AM:
------------------------------------------------------------
{quote}
Do you think we can make the log message a bit more explicit, i.e. say that the
failure was because node labels have been disabled and point out the property
that the admin should use to disable/enable node labels?
{quote}
Hi [~templedf], the following messages will be printed in the RM log:
{noformat}
2016-12-28 01:00:22,694 WARN resourcemanager.RMAppManager (RMAppManager.java:validateAndCreateResourceRequest(400)) - RM app submission failed in validating AM resource request for application application_xxxxxx
org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: Invalid resource request, node label not enabled but request contains label expression
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:396)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:341)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:321)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:439)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
... ...
2016-12-28 01:00:22,694 ERROR resourcemanager.RMAppManager (RMAppManager.java:recover(455)) - Failed to recover application application_xxxxxx
{noformat}
The first message is printed by the validation check that fails in the first
place; the second message is printed by the code in the patch. I think this
gives enough of a hint about the root cause.
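To make the suggestion in the quoted question concrete, a more explicit message
could name the property directly. The sketch below is only an illustration and
is not the code in YARN-6031.001.patch: the class RecoveryMessageSketch, the
method explicitMessage, the sample label "labelX", and the message wording are
all hypothetical. The property yarn.node-labels.enabled is the yarn-site.xml
setting that turns node labels on or off.

```java
// Hypothetical helper, not part of Hadoop: builds a recovery-failure message
// that names the property controlling node labels.
class RecoveryMessageSketch {
    // YARN property (set in yarn-site.xml) that enables or disables node labels.
    static final String NODE_LABELS_ENABLED = "yarn.node-labels.enabled";

    // Assumed wording; the real log message may differ.
    static String explicitMessage(String applicationId, String labelExpression) {
        return "Failed to recover application " + applicationId
            + ": the request contains node label expression '" + labelExpression
            + "' but node labels are disabled. Set " + NODE_LABELS_ENABLED
            + "=true in yarn-site.xml to re-enable node labels.";
    }

    public static void main(String[] args) {
        // "labelX" is a made-up label expression used only for this demo.
        System.out.println(explicitMessage("application_xxxxxx", "labelX"));
    }
}
```

A message along these lines would tell the admin both why recovery failed and
which setting to change, without needing to read the stack trace.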
> Application recovery failed after disabling node label
> ------------------------------------------------------
>
> Key: YARN-6031
> URL: https://issues.apache.org/jira/browse/YARN-6031
> Project: Hadoop YARN
> Issue Type: Bug
> Components: scheduler
> Affects Versions: 2.8.0
> Reporter: Ying Zhang
> Assignee: Ying Zhang
> Priority: Minor
> Attachments: YARN-6031.001.patch
>
>
> Here are the repro steps:
> Enable node labels, restart the RM, configure CS properly, and run some jobs;
> Disable node labels, restart the RM, and the following exception is thrown:
> {noformat}
> Caused by: org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: Invalid resource request, node label not enabled but request contains label expression
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
>     at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:394)
>     at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:339)
>     at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:319)
>     at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
>     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>     ... 10 more
> {noformat}
> During RM restart, application recovery failed because the application had a
> node label expression specified while node labels had been disabled.