[
https://issues.apache.org/jira/browse/YARN-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16163703#comment-16163703
]
Arun Suresh edited comment on YARN-5972 at 9/13/17 12:21 AM:
-------------------------------------------------------------
I was not sure whether we need a formal merge vote for this, given that its
scope has been slightly reduced. The 3 sub-tasks under this umbrella all
deal with opening up the interfaces and adding methods (which default to
"feature not supported" exceptions) to the abstract {{ContainerExecutor}}.
Most of the changes are in the {{ContainerScheduler}}, with some minor changes
to the NM-side Container state machines and the NM state store. The feature
itself requires a plugged-in ContainerExecutor implementation that supports
Pausing and Thawing, and is therefore OFF by default. Support for the
{{LinuxContainerExecutor}} is being tracked at YARN-6838, but I do not feel it
should block merging this to trunk.
Given the above, I was wondering if it would be ok to just merge the 3 JIRAs
into trunk and branch-2. Do let me know if anyone has any objections to doing
so.
(cc [~jlowe] / [~jianhe] / [~chris.douglas])
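The "default to unsupported" pattern described above could look roughly like the sketch below. This is only an illustration, not the actual patch: the class names (SketchContainerExecutor, DefaultSketchExecutor, PausingSketchExecutor), the method names and the use of UnsupportedOperationException are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the abstract ContainerExecutor (not the real Hadoop class).
abstract class SketchContainerExecutor {
    // Default behaviour: an executor that cannot pause reports the feature as unsupported.
    void pauseContainer(String containerId) {
        throw new UnsupportedOperationException("pause not supported: " + containerId);
    }

    void resumeContainer(String containerId) {
        throw new UnsupportedOperationException("resume not supported: " + containerId);
    }
}

// An executor that inherits the defaults, so pausing stays effectively OFF.
class DefaultSketchExecutor extends SketchContainerExecutor { }

// An executor that opts in by overriding both methods (assumed behaviour: track paused ids).
class PausingSketchExecutor extends SketchContainerExecutor {
    private final Set<String> paused = new HashSet<>();

    @Override
    void pauseContainer(String containerId) {
        paused.add(containerId);
    }

    @Override
    void resumeContainer(String containerId) {
        paused.remove(containerId);
    }

    boolean isPaused(String containerId) {
        return paused.contains(containerId);
    }
}
```

With this shape, existing executors keep compiling and behaving as before, and only an implementation that overrides both methods enables the feature.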
> Support Pausing/Freezing of opportunistic containers
> ----------------------------------------------------
>
> Key: YARN-5972
> URL: https://issues.apache.org/jira/browse/YARN-5972
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Hitesh Sharma
> Assignee: Hitesh Sharma
> Attachments: container-pause-resume.pdf
>
>
> YARN-2877 introduced OPPORTUNISTIC containers, and YARN-5216 proposes to add
> capability to customize how OPPORTUNISTIC containers get preempted.
> In this JIRA we propose introducing a PAUSED container state.
> Instead of preempting a running container, the container can be moved to a
> PAUSED state, where it remains until resources are freed up on the node; the
> paused container can then resume to the running state.
> Note that process freezing is already supported by the 'cgroups freezer',
> which is used internally by the docker pause functionality. Windows also has
> OS-level support of a similar nature.
> One scenario where this capability is useful is work preservation. How
> preemption is done, and whether the container supports it, is implementation
> specific.
> For instance, if the container is a virtual machine, then a preempt call
> would pause the VM and a resume call would restore it to the running state.
> If the container executor / runtime doesn't support preemption, then preempt
> would default to killing the container.
>
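The pause-or-kill behaviour proposed in the description could be sketched roughly as below. All names here (SketchState, SketchRuntime, SketchContainer, onResourcesFreed) are hypothetical and chosen for illustration; they are not the NodeManager's real state machine or API.

```java
// Hypothetical sketch; none of these types are real YARN NodeManager classes.
enum SketchState { RUNNING, PAUSED, KILLED }

// Assumed capability check for the container executor / runtime.
interface SketchRuntime {
    boolean supportsPause();
}

class SketchContainer {
    private SketchState state = SketchState.RUNNING;
    private final SketchRuntime runtime;

    SketchContainer(SketchRuntime runtime) {
        this.runtime = runtime;
    }

    // Preempt: pause when the runtime supports it, otherwise fall back to killing.
    void preempt() {
        if (state != SketchState.RUNNING) {
            return; // only a running container can be preempted
        }
        state = runtime.supportsPause() ? SketchState.PAUSED : SketchState.KILLED;
    }

    // Called when resources are freed on the node: a paused container resumes.
    void onResourcesFreed() {
        if (state == SketchState.PAUSED) {
            state = SketchState.RUNNING;
        }
    }

    SketchState getState() {
        return state;
    }
}
```

A killed container stays killed when resources are freed; only the PAUSED state preserves the work already done.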