[ 
https://issues.apache.org/jira/browse/YARN-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224208#comment-15224208
 ] 

Junping Du commented on YARN-4679:
----------------------------------

bq. Clearly, we should handle the NM resize (especially shrink) very carefully.
YARN-291 works mostly on RM-side scheduling, though YARN-4832 does notify the NM 
of its new resource.
Currently, when an NM's resource shrinks, the RM only adjusts its scheduling 
decisions and does not affect containers that are already running. Yes, this 
can cause resource over-commitment. That is tracked by YARN-999. The RM should 
send container preemption to the NM when current resource < consumed resource.
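
To make the condition concrete, here is a minimal sketch of the check described 
above. It is not YARN code: the class ShrinkCheck, its method names, and the use 
of a single memory figure in MB are all assumptions made for illustration. It 
only shows when preemption would be triggered and how much would need to be 
reclaimed.

```java
// Hypothetical helper, not part of YARN: models one resource dimension (memory in MB).
final class ShrinkCheck {
    private ShrinkCheck() {}

    // True when the node's current (post-resize) resource is smaller than what
    // running containers already consume, i.e. the over-commitment case.
    static boolean isOverCommitted(long currentMb, long consumedMb) {
        return currentMb < consumedMb;
    }

    // Amount that would have to be reclaimed through preemption to bring the
    // consumed resource back within the current resource; 0 if nothing is needed.
    static long toReclaim(long currentMb, long consumedMb) {
        return Math.max(0L, consumedMb - currentMb);
    }
}
```

For example, an NM shrunk to 4096 MB while its containers use 8192 MB is 
over-committed, and 4096 MB would have to be reclaimed.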

> When work-preserving restart is enabled, the scheduler should wait for the 
> earlier of recovery completion and configured wait time
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4679
>                 URL: https://issues.apache.org/jira/browse/YARN-4679
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Karthik Kambatla
>
> When work-preserving restart is enabled, it appears the restart (or failover) 
> is unconditionally blocked for the configured delay even if the recovery 
> itself finishes sooner than this. This should be updated to wait for the 
> earlier of the two conditions. Also, it would be nice to allow setting the 
> config to -1 to indicate waiting as long as needed for the recovery to be 
> completed. 
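
The behaviour requested in the description can be sketched as follows. This is 
an illustration under assumptions, not the ResourceManager implementation: the 
class RecoveryWait, the method awaitRecovery, and the use of a CountDownLatch to 
signal recovery completion are all hypothetical. The sketch returns as soon as 
recovery completes or the configured wait elapses, whichever comes first, and 
treats a negative wait as "no time limit".

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of YARN.
final class RecoveryWait {
    private RecoveryWait() {}

    // Blocks until recoveryDone reaches zero or waitMs elapses, whichever is earlier.
    // A negative waitMs (e.g. -1) means wait for recovery with no time limit.
    // Returns true if recovery completed, false if the wait timed out or was interrupted.
    static boolean awaitRecovery(CountDownLatch recoveryDone, long waitMs) {
        try {
            if (waitMs < 0) {
                recoveryDone.await();
                return true;
            }
            return recoveryDone.await(waitMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller and report "not completed".
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this shape, a recovery that finishes early releases the caller 
immediately instead of blocking for the full configured delay.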



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
