[ 
https://issues.apache.org/jira/browse/YARN-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924952#comment-15924952
 ] 

Jason Lowe commented on YARN-6325:
----------------------------------

I don't have the full backstory on queue name requirements, but I agree it 
seems like a bug given the ambiguity in some APIs.  However, since most of the 
user-facing APIs are only used for leaf queues (that's where the apps actually 
run), I can see how the parent/leaf conflict potential was missed.

I do worry that if we suddenly enforce something we didn't before, we would 
break some user's long-standing setup.  This seems like something we should fix 
for 3.x going forward, but I'm not sure it's worth the compatibility risk in 
2.x.  Thoughts?

> ParentQueue and LeafQueue with same name can cause queue name based 
> operations to fail
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-6325
>                 URL: https://issues.apache.org/jira/browse/YARN-6325
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: Jonathan Hung
>         Attachments: capacity-scheduler.xml, Screen Shot 2017-03-13 at 
> 2.28.30 PM.png
>
>
> For example, configure capacity scheduler with two leaf queues: {{root.a.a1}} 
> and {{root.b.a}}, with {{yarn.scheduler.capacity.root.queues}} as {{b,a}} (in 
> that order).
> Then add a mapping e.g. {{u:username:a}} to {{capacity-scheduler.xml}} and 
> call {{refreshQueues}}. Operation fails with
> {noformat}
> refreshQueues: java.io.IOException: Failed to re-init queues : mapping contains invalid or non-leaf queue a
>       at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
>       at org.apache.hadoop.yarn.server.resourcemanager.AdminService.logAndWrapException(AdminService.java:866)
>       at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:391)
>       at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshQueues(ResourceManagerAdministrationProtocolPBServiceImpl.java:114)
>       at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:271)
>       at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>       at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:867)
>       at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:813)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2653)
> Caused by: java.io.IOException: Failed to re-init queues : mapping contains invalid or non-leaf queue a
>       at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:404)
>       at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:396)
>       at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:386)
>       ... 10 more
> Caused by: java.io.IOException: mapping contains invalid or non-leaf queue a
>       at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:547)
>       at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:571)
>       at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitializeQueues(CapacityScheduler.java:595)
>       at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:400)
>       ... 12 more
> {noformat}
> Part of the issue is that the {{queues}} map in 
> {{CapacitySchedulerQueueManager}} stores queues by queue name. We could do 
> one of a few things:
> # Disallow ParentQueues and LeafQueues from having the same queue name (this 
> breaks compatibility).
> # Store queues by queue path instead of queue name. This might require 
> changes in lots of places, e.g. in this case the queue-mappings would have to 
> map to a queue path instead of a queue name (which also breaks 
> compatibility), and possibly others.
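
The name collision described in the issue can be illustrated with a small standalone sketch. It is not the actual CapacitySchedulerQueueManager code: the Queue class, the helper methods, and the registration order are all hypothetical stand-ins, chosen only to show how a map keyed by short queue name can let the parent root.a shadow the leaf root.b.a, while a map keyed by full queue path keeps both.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class QueueNameCollisionSketch {

    /** Hypothetical stand-in for a scheduler queue; not the real CSQueue type. */
    static final class Queue {
        final String path;   // full path, e.g. "root.b.a"
        final boolean leaf;  // true if applications can run in this queue

        Queue(String path, boolean leaf) {
            this.path = path;
            this.leaf = leaf;
        }

        /** Last path component, e.g. "a" for "root.b.a". */
        String shortName() {
            return path.substring(path.lastIndexOf('.') + 1);
        }
    }

    /** Queues from the issue's example. The order is an assumption for illustration. */
    static Queue[] sampleQueues() {
        return new Queue[] {
            new Queue("root", false),
            new Queue("root.b", false),
            new Queue("root.b.a", true),   // leaf named "a"
            new Queue("root.a", false),    // parent also named "a"
            new Queue("root.a.a1", true),
        };
    }

    /** Index by short name: a later queue silently replaces an earlier one. */
    static Map<String, Queue> indexByName(Queue... queues) {
        Map<String, Queue> index = new LinkedHashMap<>();
        for (Queue q : queues) {
            index.put(q.shortName(), q);
        }
        return index;
    }

    /** Index by full path: every queue keeps its own entry. */
    static Map<String, Queue> indexByPath(Queue... queues) {
        Map<String, Queue> index = new LinkedHashMap<>();
        for (Queue q : queues) {
            index.put(q.path, q);
        }
        return index;
    }

    /** A mapping target is usable only if it resolves to a leaf queue. */
    static boolean isValidMappingTarget(Map<String, Queue> index, String key) {
        Queue q = index.get(key);
        return q != null && q.leaf;
    }

    public static void main(String[] args) {
        Queue[] queues = sampleQueues();
        Map<String, Queue> byName = indexByName(queues);
        Map<String, Queue> byPath = indexByPath(queues);

        // "a" resolves to the parent root.a, so the leaf check fails.
        System.out.println("by name, a:        " + isValidMappingTarget(byName, "a"));
        // The full path is unambiguous and resolves to the leaf.
        System.out.println("by path, root.b.a: " + isValidMappingTarget(byPath, "root.b.a"));
    }
}
```

In this sketch the name-keyed lookup for "a" fails the leaf check, which is the kind of ambiguity behind the "invalid or non-leaf queue" error, while the path-keyed lookup succeeds. The cost of the path-keyed variant is the one noted in option 2: callers such as queue mappings would have to supply paths rather than short names.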


