zhengchenyu created YARN-6568:
---------------------------------

             Summary: A queue which runs a long time job couldn't acquire any 
container for long time.
                 Key: YARN-6568
                 URL: https://issues.apache.org/jira/browse/YARN-6568
             Project: Hadoop YARN
          Issue Type: Bug
          Components: fairscheduler
    Affects Versions: 2.7.1
         Environment: CentOS 7.1
            Reporter: zhengchenyu
             Fix For: 2.7.4


In our cluster, we found that some applications could not acquire any container
for a long time. (Note: we use FairScheduler with FairSharePolicy.)

First, I found an unreasonable configuration: we had set minRes=maxRes. As a
result, some applications kept pending for a long time, and we killed some large
applications to work around it. After we changed this configuration, the problem
was relieved.

But the problem is not completely solved. In our cluster, I found that
applications in some queues that request only a few containers still keep
pending for a long time.

I reproduced this in a test cluster. I submitted a DistributedShell application
that runs many loop applications to queueA, then submitted my own YARN
application, which constantly requests and releases containers, to queueB. From
then on, any application submitted to queueA kept pending!

We know this is a problem with FairSharePolicy, which takes each queue's
requests into account when comparing queues. So after the queues are sorted,
queues that have few requests are always ordered last.
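
To make the ordering problem concrete, here is a toy model. It is not the real
FairSharePolicy comparator. It assumes, as a simplification, that queues are
ordered by usage divided by min(minShare, demand), and that each scheduling
round hands one free container to the first queue with a pending request. The
class, fields, and numbers (ToyQueue, 8 held containers, minShare of 20) are
hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class StarvationToy {
    static final class ToyQueue {
        final String name;
        final int minShare; // configured minimum share, in containers
        int usage;          // containers currently held
        int pending;        // outstanding container requests
        int granted;        // containers granted during the simulation

        ToyQueue(String name, int minShare, int usage, int pending) {
            this.name = name;
            this.minShare = minShare;
            this.usage = usage;
            this.pending = pending;
        }

        // Assumed ordering key: usage relative to min(minShare, demand),
        // where demand = usage + pending. Lower ratio sorts first.
        double ratio() {
            int demand = usage + pending;
            return (double) usage / Math.max(1, Math.min(minShare, demand));
        }
    }

    static final Comparator<ToyQueue> BY_RATIO =
        Comparator.comparingDouble(ToyQueue::ratio);

    // Returns {containers granted to queueA, containers granted to queueB}.
    static int[] simulate(int rounds) {
        // queueA: long-running job holds 8 containers, one new request pending.
        ToyQueue a = new ToyQueue("queueA", 20, 8, 1);
        // queueB: holds nothing, keeps requesting and releasing one container.
        ToyQueue b = new ToyQueue("queueB", 20, 0, 1);
        List<ToyQueue> queues = new ArrayList<>(Arrays.asList(a, b));

        for (int i = 0; i < rounds; i++) {
            queues.sort(BY_RATIO);
            for (ToyQueue q : queues) {
                if (q.pending > 0) { // first queue with demand wins the container
                    q.usage++;
                    q.pending--;
                    q.granted++;
                    break;
                }
            }
            if (b.usage > 0) {       // queueB releases and immediately re-requests
                b.usage--;
                b.pending++;
            }
        }
        return new int[] {a.granted, b.granted};
    }

    public static void main(String[] args) {
        int[] g = simulate(100);
        System.out.println("queueA granted=" + g[0] + ", queueB granted=" + g[1]);
    }
}
```

Under these assumptions queueA's ratio stays at 8/9 while queueB's returns to 0
before every sort, so queueB is always first and queueA's single request is
never served.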

We know that once the AM container is launched, the application's requests will
increase. But FairSharePolicy can't distinguish which request is the AM request.
I think that once the AM container is assigned, the problem is solved.

My colleagues and I discussed this problem. We recommend setting a timeout for
each queue, where the timed value is the length of time the queue has gone
without being assigned a container. If the timeout is exceeded, we move this
queue to the first place in the queue list.
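
A minimal sketch of the timeout idea follows. It wraps an arbitrary base
comparator and is not a patch against FairScheduler. QueueInfo, its fields, and
withTimeout are hypothetical names. Two details are assumptions that go beyond
the proposal above: only queues with pending requests count as timed out, and
among timed-out queues the one that has waited longest goes first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class StarvationTimeoutSketch {
    // Hypothetical per-queue snapshot; not a FairScheduler class.
    static final class QueueInfo {
        final String name;
        final double usageRatio;       // stand-in for the base policy's sort key
        final boolean hasPending;      // queue has unsatisfied requests
        final long lastAssignedMillis; // last time the queue got a container

        QueueInfo(String name, double usageRatio, boolean hasPending,
                  long lastAssignedMillis) {
            this.name = name;
            this.usageRatio = usageRatio;
            this.hasPending = hasPending;
            this.lastAssignedMillis = lastAssignedMillis;
        }
    }

    static boolean timedOut(QueueInfo q, long nowMillis, long timeoutMillis) {
        return q.hasPending && nowMillis - q.lastAssignedMillis > timeoutMillis;
    }

    // Timed-out queues sort before all others; the rest keep the base order.
    static Comparator<QueueInfo> withTimeout(Comparator<QueueInfo> base,
                                             long nowMillis, long timeoutMillis) {
        return (q1, q2) -> {
            boolean t1 = timedOut(q1, nowMillis, timeoutMillis);
            boolean t2 = timedOut(q2, nowMillis, timeoutMillis);
            if (t1 != t2) {
                return t1 ? -1 : 1;
            }
            if (t1) { // both timed out: longest wait first (assumption)
                return Long.compare(q1.lastAssignedMillis, q2.lastAssignedMillis);
            }
            return base.compare(q1, q2);
        };
    }

    // Returns the name of the queue that would be offered a container first.
    static String firstQueue(long nowMillis, long timeoutMillis) {
        Comparator<QueueInfo> base =
            Comparator.comparingDouble((QueueInfo q) -> q.usageRatio);
        List<QueueInfo> queues = new ArrayList<>(Arrays.asList(
            new QueueInfo("queueA", 0.9, true, 0L),
            new QueueInfo("queueB", 0.0, true, 9_000L)));
        queues.sort(withTimeout(base, nowMillis, timeoutMillis));
        return queues.get(0).name;
    }

    public static void main(String[] args) {
        System.out.println(firstQueue(10_000L, 5_000L));
        System.out.println(firstQueue(10_000L, 60_000L));
    }
}
```

With a 5 second timeout, queueA has waited 10 seconds and moves to the front.
With a 60 second timeout neither queue has timed out, so the base order applies
and queueB stays first.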



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
