[ 
https://issues.apache.org/jira/browse/YARN-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089325#comment-15089325
 ] 

Karam Singh commented on YARN-4565:
-----------------------------------

I came across this issue while experimenting with fairness within a queue in 
CapacityScheduler.
I encountered a situation where, with FairOrderingPolicy and SizeBasedWeight 
enabled on a queue in CapacityScheduler, running GridMix V3 results in all of 
the queue's resources being consumed by AMs.

The settings were as follows:
Cluster total memory capacity=864GB, Global AMResourcePercent=0.1, Global 
MaxApplications=10000, minAllocationMb=2048, AM memory=2048, 
mapMemory=reduceMemory=2048

Queue settings:
Capacity=10
MaxCapacity=80
UserLimitFactor=8
UserLimitPercent=100
FairOrderingPolicy with SizeBasedWeight=true

With these settings, at most 35 AMs can run simultaneously and at most 345 
containers in total can run in the queue.
This was verified by running GridMix V3 (which submits 760 applications) with 
FairOrderingPolicy only (without SizeBasedWeight).
When the same test was run with FairOrderingPolicy and SizeBasedWeight=true, 
345 AMs (applications) were running, and since all queue resources were used 
by AMs, no more containers could run, causing all applications to get stuck.
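
The 35 and 345 figures above can be reproduced with a short back-of-the-envelope 
calculation. This is only a sketch, not the scheduler's actual code: it assumes 
the AM limit is AMResourcePercent times the queue's maximum capacity, rounded up 
to a multiple of minAllocationMb (an assumption chosen because it matches the 
reported 35), and the function and variable names are purely illustrative.

```python
import math
from fractions import Fraction


def queue_limits(total_mb, max_capacity_pct, am_percent, min_alloc_mb):
    """Rough estimate of queue limits; NOT the real CapacityScheduler logic."""
    # Memory the queue may grow to (MaxCapacity percent of the cluster).
    queue_max_mb = Fraction(total_mb * max_capacity_pct, 100)
    # Whole containers of min_alloc_mb that fit into that memory.
    max_containers = math.floor(queue_max_mb / min_alloc_mb)
    # Assumption: the AM limit is rounded UP to a multiple of min_alloc_mb.
    am_limit_mb = queue_max_mb * am_percent
    max_ams = math.ceil(am_limit_mb / min_alloc_mb)
    return queue_max_mb, max_containers, max_ams


# Values reported in this comment.
total_mb = 864 * 1024              # 864GB cluster
capacity_pct, max_capacity_pct = 10, 80
user_limit_factor = 8

# One user may use Capacity * UserLimitFactor, capped at MaxCapacity: 80 percent.
user_limit_pct = min(capacity_pct * user_limit_factor, max_capacity_pct)

queue_max_mb, max_containers, max_ams = queue_limits(
    total_mb, user_limit_pct, Fraction(1, 10), 2048
)
print(max_containers, max_ams)     # 345 35

# With 345 AMs of 2048MB each running, what is left cannot fit another container.
leftover_mb = queue_max_mb - 345 * 2048
print(float(leftover_mb), leftover_mb < 2048)   # 1228.8 True
```

Under these assumptions, 345 running AMs leave less than one minimum allocation 
free in the queue, which is consistent with no map or reduce container being 
able to start.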

It looks like sizeBasedWeight somehow changes or overrides AMResourcePercent.
  

> When sizeBasedWeight enabled for FairOrderingPolicy in CapacityScheduler, 
> Sometimes lead to situation where all queue resources consumed by AMs only
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4565
>                 URL: https://issues.apache.org/jira/browse/YARN-4565
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler
>    Affects Versions: 2.8.0, 2.7.1
>            Reporter: Karam Singh
>
> When sizeBasedWeight is enabled for FairOrderingPolicy in CapacityScheduler, 
> it sometimes leads to a situation where all queue resources are consumed by 
> AMs only. From the user's perspective, all applications in the queue appear 
> to be stuck, since the whole queue capacity is consumed by AMs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
