Sunil G commented on YARN-4479:

Thanks [~Naganarasimha Garla] for the comments.

bq.This patch tries to activate all applications which were running before RM 
restart happened
That said, yes, it definitely depends on the available AM limit after restart 
(I meant the positive case in my earlier comment, where all cluster resources 
were available). 
I did think about the case where some NMs have not registered back and the 
limit is lower. In that case, app-A1 will stay in the pending list waiting to 
be activated, and it will be the first application activated once any space is 
available. This ensures that high priority apps which were in the pending list 
will get containers, and app-A1, which was low in priority, will wait. Even if 
A1 were activated, it would have to wait until the other high priority apps are 
done with their requests. So having A1 in the pending list may be fine, provided 
the other apps complete sooner or the failed NMs come back up. But I am not 
saying it's correct. It's debatable, and I think with discussion we can settle 
on the approach here.
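
To make the scenario concrete, here is a rough standalone sketch of one reading 
of it. This is not the patch and does not use any YARN classes: the {{App}} 
type, the {{RECOVERY_ORDER}} comparator, the sample priorities and the AM 
resource units are all assumptions made up for illustration. It assumes the 
pending list is ordered with previously running apps first and then by 
priority, and that activation stops at the first app that does not fit under 
the AM limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PendingOrderSketch {

    /** Hypothetical stand-in for a scheduler application; not a YARN class. */
    static final class App {
        final String name;
        final int priority;        // larger value means higher priority
        final boolean wasRunning;  // true if the app was active before the RM restart
        final int amResource;      // AM demand in arbitrary units

        App(String name, int priority, boolean wasRunning, int amResource) {
            this.name = name;
            this.priority = priority;
            this.wasRunning = wasRunning;
            this.amResource = amResource;
        }
    }

    /** Assumed pending order: previously running apps first, then higher priority, then name. */
    static final Comparator<App> RECOVERY_ORDER = (x, y) -> {
        if (x.wasRunning != y.wasRunning) {
            return x.wasRunning ? -1 : 1;
        }
        if (x.priority != y.priority) {
            return Integer.compare(y.priority, x.priority);
        }
        return x.name.compareTo(y.name);
    };

    /** Activates apps in pending order until the next one would exceed the AM limit. */
    static List<String> activate(List<App> pending, int amLimit) {
        List<App> ordered = new ArrayList<>(pending);
        ordered.sort(RECOVERY_ORDER);
        List<String> activated = new ArrayList<>();
        int used = 0;
        for (App app : ordered) {
            if (used + app.amResource > amLimit) {
                break; // this app and everything behind it stay pending
            }
            used += app.amResource;
            activated.add(app.name);
        }
        return activated;
    }

    /** Sample data: H1, H2 and A1 were running before restart; N1 was still pending. */
    static List<App> sampleApps() {
        return Arrays.asList(
            new App("N1", 9, false, 2),
            new App("A1", 1, true, 2),
            new App("H1", 10, true, 2),
            new App("H2", 8, true, 2));
    }

    public static void main(String[] args) {
        System.out.println("AM limit 4: " + activate(sampleApps(), 4));
        System.out.println("AM limit 6: " + activate(sampleApps(), 6));
    }
}
```

With the sample data, a reduced AM limit of 4 activates only H1 and H2, and A1 
stays at the head of the pending list. Once the limit is back to 6, A1 is 
activated ahead of N1, even though N1 has the higher priority.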

Also, about {{All containers which were running earlier will still continue}}: 
I meant the live containers of apps which were running prior to restart. 
After restart, even for pending apps (apps like A1) as mentioned in your 
scenario, their running containers won't be killed. Am I missing something?

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> -------------------------------------------------------------------------------
>                 Key: YARN-4479
>                 URL: https://issues.apache.org/jira/browse/YARN-4479
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, resourcemanager
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>         Attachments: 0001-YARN-4479.patch
> Currently, the same ordering policy is used for pending applications and 
> active applications. When priority is configured for an application, the 
> high priority applications get activated first during recovery. It is 
> possible that a low priority job was submitted and was in the running state. 
> This causes the low priority job to starve after recovery.

This message was sent by Atlassian JIRA
