[ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2918:
-----------------------------
    Description: 
Currently, if an admin sets up labels on queues 
({{<queue-path>.accessible-node-labels = ...}}) and a label has not been added 
to the RM, the queue's initialization fails and the RM fails to start too:
{noformat}
2014-12-03 20:11:50,126 FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
ResourceManager
...
Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
please check.
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.<init>(AbstractCSQueue.java:109)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.<init>(LeafQueue.java:120)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
        at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
{noformat}

This is not a good user experience. We should stop failing the RM in this case 
so that the admin can configure queues/labels in the following steps:
- Configure queue (with label)
- Start RM
- Add labels to RM
- Submit applications

As it stands, the admin has to:
- Configure queue (without label)
- Start RM
- Add labels to RM
- Refresh queue's config (with label)
- Submit applications
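
The difference between the two flows can be sketched in plain Java. This is only an illustration of the proposed behavior, not the actual patch: {{QueueLabelCheck}}, {{checkStrict}} and {{findUnknownLabels}} are hypothetical names, and the real check lives in {{SchedulerUtils.checkIfLabelInClusterNodeLabels}} as shown in the stack trace above. The strict variant mirrors the current failure; the lenient variant assumes the RM would only report unknown labels and keep starting.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, for illustration only; not part of the YARN code base.
final class QueueLabelCheck {

    // Strict check, modeled on the current behavior: the first queue label
    // that the cluster does not know about aborts initialization.
    public static void checkStrict(Set<String> clusterLabels, Set<String> queueLabels)
            throws IOException {
        for (String label : new TreeSet<>(queueLabels)) {
            if (!clusterLabels.contains(label)) {
                throw new IOException(
                    "NodeLabelManager doesn't include label = " + label + ", please check.");
            }
        }
    }

    // Lenient check (assumed proposed behavior): collect the unknown labels,
    // in sorted order, so the caller can log a warning and continue.
    public static List<String> findUnknownLabels(Set<String> clusterLabels,
                                                 Set<String> queueLabels) {
        List<String> unknown = new ArrayList<>();
        for (String label : new TreeSet<>(queueLabels)) {
            if (!clusterLabels.contains(label)) {
                unknown.add(label);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        Set<String> clusterLabels = new HashSet<>();   // nothing added to the RM yet
        Set<String> queueLabels = new HashSet<>();
        queueLabels.add("x");                          // queue configured with label x

        // "Start RM": warn about unknown labels instead of failing.
        List<String> unknown = findUnknownLabels(clusterLabels, queueLabels);
        if (!unknown.isEmpty()) {
            System.out.println("WARN: queue labels not in cluster yet: " + unknown);
        }

        // "Add labels to RM": afterwards the same check reports nothing.
        clusterLabels.add("x");
        System.out.println("Unknown after adding label: "
            + findUnknownLabels(clusterLabels, queueLabels));
    }
}
```

With the lenient variant, the "Refresh queue's config" step from the second list is no longer needed, because the queue configuration is accepted before the label exists in the cluster.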

  was:
I configured accessible-node-labels on a queue, but RM startup fails with the 
exception below. As far as I can see, the current steps to configure a node 
label are to first add it via rmadmin and only then configure it for queues. 
It would be good if cluster and queue node labels were configured consistently. 
{noformat}
2014-12-03 20:11:50,126 FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
ResourceManager
org.apache.hadoop.service.ServiceStateException: java.io.IOException: 
NodeLabelManager doesn't include label = x, please check.
        at 
org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
        at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
        at 
org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:556)
        at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:982)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:249)
        at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1203)
Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
please check.
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.<init>(AbstractCSQueue.java:109)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.<init>(LeafQueue.java:120)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
        at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
{noformat}


> Don't fail RM if queue's configured labels are not existed in 
> cluster-node-labels
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-2918
>                 URL: https://issues.apache.org/jira/browse/YARN-2918
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Rohith
>            Assignee: Wangda Tan



