Anand Srinivasan created YARN-10458:
---------------------------------------

             Summary: Hive On Tez queries fails upon submission to dynamically 
created pools
                 Key: YARN-10458
                 URL: https://issues.apache.org/jira/browse/YARN-10458
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
            Reporter: Anand Srinivasan


Recently, one of our customers created dynamic queues based on placement rules 
in CDP Private Cloud Base 71.2 to run their Hive on Tez queries, but the jobs 
failed because they were not submitted to the intended queue.

Analyzing the Resource Manager log, we could see that queue creation fails 
because the submit-application ACL check does not succeed.

We tried setting acl_submit_applications to '*' on the managed parent queues. 
This worked for static queues but not for dynamic queues. We also tried setting 
the property below, but it didn't help either:
yarn.scheduler.capacity.root.parent-queue-name.leaf-queue-template.acl_submit_applications=*

The RM log shows the following:

2020-09-18 01:08:40,579 INFO org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule: Application application_1600399068816_0460 user user1 mapping [default] to [queue1] override false
2020-09-18 01:08:40,579 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: User 'user1' from application tag does not have access to  queue 'user1'. The placement is done for user 'hive'
 

Checking the code, scheduler#checkAccess() bails out even before checking the 
ACL permissions for that particular queue because the CSQueue is null.

  public boolean checkAccess(UserGroupInformation callerUGI,
      QueueACL acl, String queueName) {
    CSQueue queue = getQueue(queueName);
    if (queue == null) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("ACL not found for queue access-type " + acl
            + " for queue " + queueName);
      }
      return false;  // <-- the method returns false here
    }
    return queue.hasAccess(acl, callerUGI);
  }

As this is an auto-created queue, the CSQueue may be null at this point. Maybe 
scheduler#checkAccess() should distinguish the case where the CSQueue is null 
and a queue mapping is involved. In that case it could check whether the parent 
queue exists and is a managed parent and, if so, evaluate the parent queue's 
ACLs instead of returning false?
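
To make the suggestion concrete, here is a minimal, self-contained sketch of 
the fallback idea. It is not the actual CapacityScheduler code: the Queue 
class, the managedParent flag, the string-based ACL set, and the parentName 
argument are all hypothetical stand-ins for CSQueue, the managed parent queue 
type, QueueACL, and whatever the placement rule would supply.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class AclFallbackSketch {

    /** Hypothetical stand-in for a scheduler queue; not the real CSQueue. */
    static final class Queue {
        final String name;
        final boolean managedParent;   // true if leaf queues are auto-created under it
        final Set<String> submitAcl;   // users allowed to submit; "*" means everyone

        Queue(String name, boolean managedParent, Set<String> submitAcl) {
            this.name = name;
            this.managedParent = managedParent;
            this.submitAcl = submitAcl;
        }

        boolean hasAccess(String user) {
            return submitAcl.contains("*") || submitAcl.contains(user);
        }
    }

    private final Map<String, Queue> queues = new HashMap<>();

    void addQueue(Queue queue) {
        queues.put(queue.name, queue);
    }

    /** Mirrors the behavior quoted above: an unknown queue is always denied. */
    boolean checkAccessCurrent(String user, String queueName) {
        Queue queue = queues.get(queueName);
        if (queue == null) {
            return false;
        }
        return queue.hasAccess(user);
    }

    /**
     * Proposed behavior (assumption): if the leaf queue does not exist yet,
     * fall back to the ACL of its managed parent, when there is one.
     */
    boolean checkAccessWithFallback(String user, String queueName, String parentName) {
        Queue queue = queues.get(queueName);
        if (queue != null) {
            return queue.hasAccess(user);
        }
        Queue parent = (parentName == null) ? null : queues.get(parentName);
        if (parent == null || !parent.managedParent) {
            return false;  // no managed parent to consult, so keep denying
        }
        return parent.hasAccess(user);
    }

    public static void main(String[] args) {
        AclFallbackSketch scheduler = new AclFallbackSketch();
        // Only the managed parent exists; the leaf "user1" has not been created yet.
        scheduler.addQueue(new Queue("root.managed", true, Set.of("*")));

        System.out.println(scheduler.checkAccessCurrent("user1", "user1"));                      // false
        System.out.println(scheduler.checkAccessWithFallback("user1", "user1", "root.managed")); // true
    }
}
```

In this sketch, a static (non-managed) parent still results in a denial, so 
the fallback only widens access for queues that are expected to be created 
on demand.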

Thanks



