[ 
https://issues.apache.org/jira/browse/YARN-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PengZhang updated YARN-1829:
----------------------------

    Description: 
CapacityScheduler validates a new configuration to make sure all existing 
queues are still present, but this check is not sufficient:
1. When we change a queue (named A) from a leaf to a parent, the new 
configuration passes validation and A's new child (X) is added to queues. 
Later, root.reinitialize() fails because the queue type has changed.
2. Then we add a new parent queue (named B) with child X, and change queue A's 
state to STOPPED. This applies successfully, but jobs submitted to queue X 
can never be scheduled, because LeafQueue X was already added in step 1 
and its parent points to A, which is STOPPED. 

 root   
 /   
A     
queues: root, A


  root  
  /
 A
/
X
reinitialize failed, but X is added to queues
queues: root, A, X

  root 
  / \
 A   B
      \
       X
the new node X does not replace the old one
queues: root, A, X (the value for X is not the LeafQueue that is in the tree)
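
The sequence above can be reproduced with a small toy model. This is only a
sketch, not the CapacityScheduler code: the Queue class and the parse, flatten,
reinit_tree, reinitialize and schedulable_jobs helpers are hypothetical names
invented for illustration. It assumes, as the report describes, that new queues
are added to the shared map before the tree is reinitialized and that existing
map entries are never replaced. It models only the stale map entry for X; job
submission going through the map while scheduling walks the tree is a further
assumption of the sketch.

```python
# Toy model of the reported behaviour; all names are hypothetical.
class Queue:
    def __init__(self, name, is_leaf, state):
        self.name, self.is_leaf, self.state = name, is_leaf, state
        self.children, self.jobs = [], []

def parse(conf, name="root"):
    """Build a fresh tree from conf = {name: (child_names, state)}."""
    child_names, state = conf[name]
    q = Queue(name, is_leaf=not child_names, state=state)
    q.children = [parse(conf, c) for c in child_names]
    return q

def flatten(q):
    """Return {name: queue} for q and all of its descendants."""
    out = {q.name: q}
    for c in q.children:
        out.update(flatten(c))
    return out

def reinit_tree(old, new):
    """Update the live tree in place; reject a leaf <-> parent change."""
    if old.is_leaf != new.is_leaf:
        raise ValueError(f"queue {old.name} changed type")
    old.state = new.state
    existing = {c.name: c for c in old.children}
    for child in new.children:
        if child.name in existing:
            reinit_tree(existing[child.name], child)
        else:
            old.children.append(child)  # attach the newly created subtree

def reinitialize(root, queues, conf):
    new_root = parse(conf)
    new_queues = flatten(new_root)
    # Validation: every existing queue must still be present.
    missing = set(queues) - set(new_queues)
    if missing:
        raise ValueError(f"queues removed: {sorted(missing)}")
    # New queues go into the shared map BEFORE the tree is touched,
    # and entries that already exist are kept as they are.
    for name, q in new_queues.items():
        queues.setdefault(name, q)
    reinit_tree(root, new_root)  # may fail after the map was modified

def schedulable_jobs(q):
    """Collect jobs reachable from q through RUNNING queues only."""
    if q.state != "RUNNING":
        return []
    if q.is_leaf:
        return list(q.jobs)
    return [j for c in q.children for j in schedulable_jobs(c)]

# Initial state: root -> A
root = parse({"root": (["A"], "RUNNING"), "A": ([], "RUNNING")})
queues = flatten(root)

# Step 1: A becomes a parent of X. Reinitialization fails, X stays in the map.
try:
    reinitialize(root, queues, {"root": (["A"], "RUNNING"),
                                "A": (["X"], "RUNNING"),
                                "X": ([], "RUNNING")})
except ValueError:
    pass
names_after_step1 = sorted(queues)

# Step 2: A is a STOPPED leaf again, X now lives under the new parent B.
reinitialize(root, queues, {"root": (["A", "B"], "RUNNING"),
                            "A": ([], "STOPPED"),
                            "B": (["X"], "RUNNING"),
                            "X": ([], "RUNNING")})
queues["X"].jobs.append("job-1")   # submission looks X up in the map
tree_x = flatten(root)["X"]        # the X that is actually in the tree
```

In this model the job lands in the stale X object left over from step 1, which
is not reachable from root, so schedulable_jobs(root) never returns it.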


> CapacityScheduler can't schedule job after misconfiguration
> -----------------------------------------------------------
>
>                 Key: YARN-1829
>                 URL: https://issues.apache.org/jira/browse/YARN-1829
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: PengZhang



--
This message was sent by Atlassian JIRA
(v6.2#6252)
