Phil D'Amore created YARN-2726:
----------------------------------

             Summary: CapacityScheduler should explicitly log when an 
accessible label has no capacity
                 Key: YARN-2726
                 URL: https://issues.apache.org/jira/browse/YARN-2726
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: capacityscheduler
            Reporter: Phil D'Amore
            Priority: Minor


Given:

- Node label defined: test-label
- Two queues defined: a, b
- Label accessibility and capacity defined as follows (properties 
abbreviated for readability):

root.a.accessible-node-labels = test-label
root.a.accessible-node-labels.test-label.capacity = 100

If you restart the RM or do a 'rmadmin -refreshQueues' you will get a stack 
trace with the following error buried within:

"Illegal capacity of -1.0 for label=test-label in queue=root.b"

This occurs because test-label is accessible to b through inheritance from 
the root, but b has no capacity set for it, so the capacity resolves to -1, 
the UNDEFINED value.  To my mind this might not be obvious to the admin, and 
the resulting error message does not point to the source of the issue.

I propose that when the capacity on an accessible label is undefined, this 
be called out explicitly instead of falling through to the illegal capacity 
check.  Something like:

{code}
if (capacity == UNDEFINED) {
    throw new IllegalArgumentException("Configuration issue: label=" + label
        + " is accessible from queue=" + queue + " but has no capacity set.");
}
{code}

I'll leave it to better judgement than mine as to whether I'm throwing the 
appropriate exception there.  I think this check should be added to both 
getNodeLabelCapacities and getMaximumNodeLabelCapacities in 
CapacitySchedulerConfiguration.java.
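
To make the ordering of the two checks concrete, here is a minimal standalone 
sketch.  It does not use the real CapacitySchedulerConfiguration API: the 
class LabelCapacityCheck, the method resolveLabelCapacities, the plain Map 
standing in for the configuration, and the 0-100 range check are all 
assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the capacity lookup; not the actual YARN code.
class LabelCapacityCheck {
    // The issue states that -1 is the UNDEFINED value.
    static final float UNDEFINED = -1f;

    // Resolves a capacity for every label accessible from the queue.
    // 'configured' holds only the capacities the admin explicitly set.
    static Map<String, Float> resolveLabelCapacities(
            String queue, Set<String> accessibleLabels,
            Map<String, Float> configured) {
        Map<String, Float> result = new LinkedHashMap<>();
        for (String label : accessibleLabels) {
            float capacity = configured.getOrDefault(label, UNDEFINED);
            // Proposed check: report the missing setting explicitly...
            if (capacity == UNDEFINED) {
                throw new IllegalArgumentException(
                    "Configuration issue: label=" + label
                    + " is accessible from queue=" + queue
                    + " but has no capacity set.");
            }
            // ...before the generic range check (assumed here to be 0-100).
            if (capacity < 0f || capacity > 100f) {
                throw new IllegalArgumentException(
                    "Illegal capacity of " + capacity + " for label=" + label
                    + " in queue=" + queue);
            }
            result.put(label, capacity);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> accessible = Set.of("test-label");
        // root.a sets a capacity for the label, so it resolves normally.
        System.out.println(resolveLabelCapacities(
            "root.a", accessible, Map.of("test-label", 100f)));
        // root.b can access the label but sets no capacity for it.
        try {
            resolveLabelCapacities("root.b", accessible, Map.of());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the check placed first, the root.b case from the example above reports 
the missing capacity setting rather than an illegal value of -1.0.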



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
