[ 
https://issues.apache.org/jira/browse/YARN-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16163703#comment-16163703
 ] 

Arun Suresh commented on YARN-5972:
-----------------------------------

I was not sure whether we need a formal merge vote for this, given that the 
scope has been slightly reduced. The 3 sub-tasks under this umbrella all 
deal with opening up the interfaces and adding methods (which default to 
"feature not supported" exceptions) to the abstract ContainerExecutor.
Most of the changes are in the ContainerScheduler, with some minor changes to 
the NM-side Container state machines and the NM state store. The feature itself 
requires a plugged-in ContainerExecutor implementation that can support Pausing 
and Thawing, and is therefore OFF by default. Support for the 
LinuxContainerExecutor is being tracked at YARN-6838, but I do not feel it 
should block merging this to trunk.
Given the above, I was wondering if it would be ok to just merge the 3 JIRAs 
into trunk and branch-2. Do let me know if anyone has any objections to doing 
so.
(cc [~jlowe] / [~jianhe])
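
To illustrate the "default to feature not supported" pattern described above, 
here is a minimal, self-contained sketch. It is not the actual YARN code: the 
names PauseSketch, BaseExecutor, PlainExecutor, FreezingExecutor and tryPause 
are all hypothetical, and containers are reduced to plain string IDs. It only 
shows how an abstract executor can expose pause/resume hooks that throw unless 
a plugged-in implementation overrides them.

```java
// Hypothetical sketch; class and method names are illustrative only and are
// NOT the actual YARN ContainerExecutor API.
import java.util.ArrayList;
import java.util.List;

class PauseSketch {

  /** Stand-in for an abstract executor whose pause/resume hooks are unsupported by default. */
  abstract static class BaseExecutor {
    abstract void launch(String containerId);

    // Default: the feature is not supported unless a subclass overrides this.
    void pause(String containerId) {
      throw new UnsupportedOperationException(
          "pause not supported for " + containerId);
    }

    // Default: the feature is not supported unless a subclass overrides this.
    void resume(String containerId) {
      throw new UnsupportedOperationException(
          "resume not supported for " + containerId);
    }
  }

  /** Executor that keeps the defaults, so pausing stays off. */
  static class PlainExecutor extends BaseExecutor {
    @Override void launch(String containerId) { }
  }

  /** Executor that opts in by overriding both hooks (it only records IDs here). */
  static class FreezingExecutor extends BaseExecutor {
    final List<String> paused = new ArrayList<>();

    @Override void launch(String containerId) { }
    @Override void pause(String containerId) { paused.add(containerId); }
    @Override void resume(String containerId) { paused.remove(containerId); }
  }

  /** Returns true if the executor paused the container, false if it lacks support. */
  static boolean tryPause(BaseExecutor executor, String containerId) {
    try {
      executor.pause(containerId);
      return true;
    } catch (UnsupportedOperationException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(tryPause(new PlainExecutor(), "container_1"));    // prints "false"
    System.out.println(tryPause(new FreezingExecutor(), "container_1")); // prints "true"
  }
}
```

With this shape, existing executors compile and behave as before, and a caller 
can detect the lack of support at the call site rather than through a separate 
capability flag.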

> Support Pausing/Freezing of opportunistic containers
> ----------------------------------------------------
>
>                 Key: YARN-5972
>                 URL: https://issues.apache.org/jira/browse/YARN-5972
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Hitesh Sharma
>            Assignee: Hitesh Sharma
>         Attachments: container-pause-resume.pdf
>
>
> YARN-2877 introduced OPPORTUNISTIC containers, and YARN-5216 proposes to add 
> capability to customize how OPPORTUNISTIC containers get preempted.
> In this JIRA we propose introducing a PAUSED container state.
> Instead of preempting a running container, the container can be moved to a 
> PAUSED state, where it remains until resources are freed up on the node. At 
> that point the paused container can resume to the running state.
> Note that process freezing is already supported by the 'cgroups freezer', 
> which is used internally by the docker pause functionality. Windows also has 
> OS-level support of a similar nature.
> One scenario where this capability is useful is work preservation. How 
> preemption is done, and whether the container supports it, is implementation 
> specific.
> For instance, if the container is a virtual machine, then a preempt call would 
> pause the VM and a resume call would restore it to the running state.
> If the container executor / runtime doesn't support preemption, then preempt 
> would default to killing the container. 
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

