[ 
https://issues.apache.org/jira/browse/YARN-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046590#comment-14046590
 ] 

Wangda Tan commented on YARN-1408:
----------------------------------

Hi [~sunilg],
Thanks for working out this patch so fast!

*A major problem I've seen:*
The ResourceRequest stored in RMContainerImpl should also include the rack/any RRs.
Currently, only one ResourceRequest is stored in RMContainerImpl, which is not 
enough to recover the original requests in the following cases:
Case 1: An RR may contain other fields like relaxLocality, etc. Assume an RR is 
node-local with relaxLocality=true (the default), while its rack-local/any RRs 
have relaxLocality=false. In your current implementation, you cannot fully 
recover the original RRs.
Case 2: The rack-local RR will be missing. Assume an RR is node-local: when 
resource allocation happens, the outstanding rack-local/any numContainers will 
be decremented. You can check AppSchedulingInfo#allocateNodeLocal for the logic 
of how the outstanding rack/any #containers are decremented. See the small 
illustration below.
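To make case 1 concrete, here is a minimal sketch (the host/rack names, 
priority and capability are made-up example values, not taken from this issue) 
of the three RRs an AM typically sends for one strictly node-local container. 
Keeping only the node-local RR in RMContainerImpl drops the relaxLocality=false 
flags and the rack/any counts:

{code:java}
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

public class RelaxLocalityExample {
  public static void main(String[] args) {
    // Example values only: priority 1, 1GB / 1 vcore.
    Priority pri = Priority.newInstance(1);
    Resource cap = Resource.newInstance(1024, 1);

    // Node-local RR with the default relaxLocality=true ...
    ResourceRequest nodeLocal =
        ResourceRequest.newInstance(pri, "host1", cap, 1, true);
    // ... while the matching rack-local and ANY RRs carry
    // relaxLocality=false to pin the container to host1.
    ResourceRequest rackLocal =
        ResourceRequest.newInstance(pri, "/rack1", cap, 1, false);
    ResourceRequest offSwitch =
        ResourceRequest.newInstance(pri, ResourceRequest.ANY, cap, 1, false);

    // Storing only nodeLocal in RMContainerImpl cannot reproduce the
    // relaxLocality=false flags (case 1) or the rack/ANY numContainers
    // that were decremented at allocation time (case 2).
    List<ResourceRequest> original =
        Arrays.asList(nodeLocal, rackLocal, offSwitch);
    System.out.println(original);
  }
}
{code}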

*My thoughts on how to implement this:*
In FiCaScheduler#allocate, appSchedulingInfo.allocate will be invoked. You can 
change appSchedulingInfo.allocate to return a list of RRs, including 
node/rack/any if possible, and pass those RRs to RMContainerImpl. A rough 
sketch of the copying step follows below.
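To show the idea only (this is not the real AppSchedulingInfo/RMContainerImpl 
signature; the helper class and its name are hypothetical), the RRs that are 
about to be decremented could be deep-copied with numContainers=1, preserving 
relaxLocality, and that list handed to the RMContainer so a preempted container 
can be re-requested at the right locality levels:

{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.yarn.api.records.ResourceRequest;

// Hypothetical helper, not part of YARN; it sketches only the copying step.
public final class ResourceRequestSnapshot {

  /**
   * Deep-copies each non-null RR with numContainers=1, keeping priority,
   * resource name, capability and relaxLocality, so the copies can be
   * stored with the RMContainer and re-added to AppSchedulingInfo when
   * the container is preempted.
   */
  public static List<ResourceRequest> snapshot(ResourceRequest... requests) {
    List<ResourceRequest> copies = new ArrayList<ResourceRequest>();
    for (ResourceRequest rr : requests) {
      if (rr == null) {
        continue; // e.g. no rack-local RR for an off-switch allocation
      }
      copies.add(ResourceRequest.newInstance(rr.getPriority(),
          rr.getResourceName(), rr.getCapability(), 1,
          rr.getRelaxLocality()));
    }
    return copies;
  }
}
{code}

appSchedulingInfo.allocate could then return something like 
snapshot(nodeLocalRR, rackLocalRR, anyRR), and the scheduler could pass that 
list into the RMContainer it creates.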

And could you please elaborate on this?
bq. AM would have asked for NodeLocal in another Hosts, which may not be able 
to recover.

Does this make sense to you? I'll review the minor issues and test cases in the 
next cycle.

Thanks,
Wangda

> Preemption caused Invalid State Event: ACQUIRED at KILLED and caused a task 
> timeout for 30mins
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-1408
>                 URL: https://issues.apache.org/jira/browse/YARN-1408
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.2.0
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: Yarn-1408.1.patch, Yarn-1408.2.patch, Yarn-1408.3.patch, 
> Yarn-1408.4.patch, Yarn-1408.5.patch, Yarn-1408.patch
>
>
> Capacity preemption is enabled as follows:
>  * yarn.resourcemanager.scheduler.monitor.enable=true
>  * 
> yarn.resourcemanager.scheduler.monitor.policies=org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy
> Queue = a,b
> Capacity of Queue A = 80%
> Capacity of Queue B = 20%
> Step 1: Submit a big jobA to queue a which uses the full cluster capacity
> Step 2: Submit a jobB to queue b which would use less than 20% of the cluster 
> capacity
> The jobA task which uses queue b's capacity has been preempted and killed.
> This caused the below problem:
> 1. A new container got allocated for jobA in queue a as per a node update 
> from an NM.
> 2. This container has been preempted immediately as per the preemption policy.
> Here the ACQUIRED at KILLED invalid state exception came when the next AM 
> heartbeat reached the RM.
> ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> ACQUIRED at KILLED
> This also caused the task to go for a timeout of 30 minutes, as this container 
> was already killed by preemption.
> attempt_1380289782418_0003_m_000000_0 Timed out after 1800 secs



--
This message was sent by Atlassian JIRA
(v6.2#6252)
