[ 
https://issues.apache.org/jira/browse/YARN-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004604#comment-16004604
 ] 

zhengchenyu commented on YARN-6568:
-----------------------------------

I solved this problem by setting a timeout for each queue. 
In my first version, every queue had the same timeout. I think it is 
necessary to allow a different timeout per queue. That way, we can 
accelerate specific queues. 
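
A minimal sketch of what per-queue timeouts could look like, assuming a
shared default with optional per-queue overrides. The class QueueTimeouts
and its methods are hypothetical names for illustration only; they are not
part of the FairScheduler API or of any patch attached to this issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the FairScheduler API.
final class QueueTimeouts {
    private final long defaultTimeoutMillis;
    private final Map<String, Long> perQueueMillis = new HashMap<>();

    QueueTimeouts(long defaultTimeoutMillis) {
        this.defaultTimeoutMillis = defaultTimeoutMillis;
    }

    // Override the timeout for one queue; a smaller value lets it time out sooner.
    void setTimeout(String queueName, long timeoutMillis) {
        perQueueMillis.put(queueName, timeoutMillis);
    }

    // Queues without an override fall back to the shared default.
    long timeoutFor(String queueName) {
        return perQueueMillis.getOrDefault(queueName, defaultTimeoutMillis);
    }

    // True when the queue has gone at least its timeout without an assignment.
    boolean isTimedOut(String queueName, long lastAssignedMillis, long nowMillis) {
        return nowMillis - lastAssignedMillis >= timeoutFor(queueName);
    }
}
```

With a default of 60000 ms and an override of 5000 ms for queueA, queueA
would be treated as timed out after 5 seconds without a container, while
every other queue would wait the full minute.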

> A queue which runs a long time job couldn't acquire any container for long 
> time.
> --------------------------------------------------------------------------------
>
>                 Key: YARN-6568
>                 URL: https://issues.apache.org/jira/browse/YARN-6568
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.7.1
>         Environment: CentOS 7.1
>            Reporter: zhengchenyu
>             Fix For: 2.7.4
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> In our cluster, we found that some applications couldn't acquire any 
> container for a long time. (Note: we use FairScheduler with FairSharePolicy.)
> First, I found an unreasonable configuration: we had set minRes=maxRes. As a 
> result, some applications kept pending for a long time, and we killed some 
> large applications to work around it. After we changed this configuration, 
> the problem was relieved. 
> But the problem is not completely solved. In our cluster, I found that 
> applications in some queues which request only a few containers keep pending 
> for a long time. 
> I reproduced this in a test cluster. I submitted a DistributedShell 
> application which runs many loo applications to queueA, then I submitted my 
> own YARN application, which requests and releases containers constantly, to 
> queueB. At this point, any applications submitted to queueA keep pending!
> We know this is a problem with FairSharePolicy: it takes the requests of each 
> queue into account. So after the queues are sorted, queues which have few 
> requests are always ordered last.
> We know that once the AM container is launched, the requests will increase. 
> But FairSharePolicy can't distinguish which request is the AM request. I 
> think that if the AM container is assigned, the problem is solved. 
> My colleagues and I discussed this problem. We recommend setting a timeout 
> for each queue, meaning the maximum length of time a queue may go without 
> being assigned a container. When the timeout expires, we move this queue to 
> the first place in the queue list. 
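
The proposal quoted above can be sketched roughly as follows, assuming each
queue records the time of its last container assignment. Everything here is
hypothetical: QueueState, policyRank (a stand-in for whatever order the
existing policy produces, lower first) and order() are illustration names,
not FairScheduler classes, and this is not the actual patch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: queues that exceeded the timeout are moved to the front.
final class StarvedQueueFirst {
    // Simplified stand-in for a scheduler queue; not the real FSQueue.
    static final class QueueState {
        final String name;
        final double policyRank;        // assumed ordering key from the existing policy
        final long lastAssignedMillis;  // time of the last container assignment

        QueueState(String name, double policyRank, long lastAssignedMillis) {
            this.name = name;
            this.policyRank = policyRank;
            this.lastAssignedMillis = lastAssignedMillis;
        }
    }

    static List<String> order(List<QueueState> queues, long nowMillis, long timeoutMillis) {
        Comparator<QueueState> cmp = (a, b) -> {
            boolean aStarved = nowMillis - a.lastAssignedMillis >= timeoutMillis;
            boolean bStarved = nowMillis - b.lastAssignedMillis >= timeoutMillis;
            // Timed-out queues go before all other queues.
            if (aStarved != bStarved) {
                return aStarved ? -1 : 1;
            }
            // Among timed-out queues, the one that has waited longest goes first.
            if (aStarved) {
                return Long.compare(a.lastAssignedMillis, b.lastAssignedMillis);
            }
            // Otherwise keep the order the existing policy would have produced.
            return Double.compare(a.policyRank, b.policyRank);
        };
        List<QueueState> sorted = new ArrayList<>(queues);
        sorted.sort(cmp);
        List<String> names = new ArrayList<>();
        for (QueueState q : sorted) {
            names.add(q.name);
        }
        return names;
    }
}
```

In this sketch a queue that the policy would normally sort last still gets
the first chance at a container once it has waited longer than the timeout,
and it falls back to its normal position after it is assigned one.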


