[ 
https://issues.apache.org/jira/browse/YARN-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15651964#comment-15651964
 ] 

Daniel Templeton commented on YARN-5774:
----------------------------------------

bq. So if users misconfigure the increment resource in fair scheduler, a 
detailed error message will show up.

That's great for fair scheduler, but what about CS and FIFO?  I'm particularly 
worried about those because an increment of 0 was not previously treated as 
invalid.  What's the intent of throwing the exception in {{normalize()}}?  You 
throw an exception when you want to halt the execution flow and allow some 
out-of-sequence remedial code to run.  In this case, no one is expecting to see 
this exception, so no one will catch it, and there's no action that needs to be 
taken in the CS and FIFO cases.  The net result is that things will fail in 
indirect ways.  Since it's not a failure for CS and FIFO, I don't think you 
should throw the exception.  In the case of FS, as you point out, the only 
reason to hit the exception is misuse of the resource calculator.

> MR Job stuck in ACCEPTED status without any progress in Fair Scheduler if set 
> yarn.scheduler.minimum-allocation-mb to 0.
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-5774
>                 URL: https://issues.apache.org/jira/browse/YARN-5774
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>              Labels: oct16-easy
>         Attachments: YARN-5774.001.patch, YARN-5774.002.patch, 
> YARN-5774.003.patch, YARN-5774.004.patch
>
>
> MR Job stuck in ACCEPTED status without any progress in Fair Scheduler 
> because there is no resource request for the AM. This happened when you 
> configure {{yarn.scheduler.minimum-allocation-mb}} to zero.
> The problem is in the code used by both Capacity Scheduler and Fair 
> Scheduler. {{scheduler.increment-allocation-mb}} is a concept in FS, but not 
> CS. So the common code in class RMAppManager passes the 
> {{yarn.scheduler.minimum-allocation-mb}} as incremental one because there is 
> no incremental one for CS when it tried to normalize the resource requests.
> {code}
>      SchedulerUtils.normalizeRequest(amReq, scheduler.getResourceCalculator(),
>           scheduler.getClusterResource(),
>           scheduler.getMinimumResourceCapability(),
>           scheduler.getMaximumResourceCapability(),
>           scheduler.getMinimumResourceCapability());  --> incrementResource 
> should be passed here.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to