[
https://issues.apache.org/jira/browse/YARN-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15718618#comment-15718618
]
Arun Suresh commented on YARN-5292:
-----------------------------------
Thanks for the patch, Hitesh. A couple of comments:
* {{ContainerExecutor}}: maybe the default behavior should not be to throw an
Exception. We should probably LOG.warn() too.
* {{ContainerImpl}}: in a couple of places, you could collapse a group of
transitions like this:
{noformat}
.addTransition(ContainerState.KILLING,
    ContainerState.KILLING,
    ContainerEventType.CONTAINER_LAUNCHED)
.addTransition(ContainerState.KILLING,
    ContainerState.KILLING,
    ContainerEventType.PAUSE_CONTAINER)
{noformat}
into
{noformat}
.addTransition(ContainerState.KILLING,
    ContainerState.KILLING,
    EnumSet.of(ContainerEventType.CONTAINER_LAUNCHED,
        ContainerEventType.PAUSE_CONTAINER))
{noformat}
* It looks like when a container is REINITIALIZING and it receives a PAUSE
event, you are killing the container. I think it might be better to re-queue
the container somehow in this case, so the scheduler can restart it when
resources are available.
* I was thinking PAUSED and RESUMING should be reported to the RM as SCHEDULED
itself. SCHEDULED should be used to signify that the container allocation is
secure, but the container is not running.
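To illustrate the {{ContainerImpl}} suggestion above, here is a minimal, self-contained sketch. It assumes nothing about the actual patch: {{TransitionSketch}}, its {{Table}} class, and the trimmed-down {{State}} and {{Event}} enums are hypothetical stand-ins, not the NodeManager's real state machine classes. It only shows that registering one transition per event and registering a single transition over an {{EnumSet}} of events produce the same table.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration only; not Hadoop code.
class TransitionSketch {
  enum State { RUNNING, KILLING }
  enum Event { CONTAINER_LAUNCHED, PAUSE_CONTAINER }

  static final class Table {
    // pre-state -> (event -> post-state)
    private final Map<State, Map<Event, State>> table =
        new EnumMap<>(State.class);

    // Register a transition for a single event.
    Table addTransition(State pre, State post, Event event) {
      table.computeIfAbsent(pre, s -> new EnumMap<>(Event.class))
          .put(event, post);
      return this;
    }

    // Register the same transition for every event in the set.
    Table addTransition(State pre, State post, Set<Event> events) {
      for (Event event : events) {
        addTransition(pre, post, event);
      }
      return this;
    }

    // Look up the post-state; fail if the transition was never registered.
    State next(State current, Event event) {
      Map<Event, State> row = table.get(current);
      if (row == null || !row.containsKey(event)) {
        throw new IllegalStateException(
            "No transition from " + current + " on " + event);
      }
      return row.get(event);
    }

    Map<State, Map<Event, State>> asMap() {
      return table;
    }
  }

  // One addTransition call per event, as in the first snippet.
  static Table verbose() {
    return new Table()
        .addTransition(State.KILLING, State.KILLING, Event.CONTAINER_LAUNCHED)
        .addTransition(State.KILLING, State.KILLING, Event.PAUSE_CONTAINER);
  }

  // A single call over an EnumSet, as in the second snippet.
  static Table collapsed() {
    return new Table()
        .addTransition(State.KILLING, State.KILLING,
            EnumSet.of(Event.CONTAINER_LAUNCHED, Event.PAUSE_CONTAINER));
  }

  public static void main(String[] args) {
    // Both registration styles yield identical transition tables.
    System.out.println(verbose().asMap().equals(collapsed().asMap()));
  }
}
```

The set-based overload only loops over the single-event one, so the collapsed form changes how the transitions are written, not what they do.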
> Support for PAUSED container state
> ----------------------------------
>
> Key: YARN-5292
> URL: https://issues.apache.org/jira/browse/YARN-5292
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Hitesh Sharma
> Assignee: Hitesh Sharma
> Attachments: YARN-5292.001.patch, YARN-5292.002.patch,
> YARN-5292.003.patch, YARN-5292.004.patch, yarn-5292.pdf
>
>
> YARN-2877 introduced OPPORTUNISTIC containers, and YARN-5216 proposes adding
> the capability to customize how OPPORTUNISTIC containers get preempted.
> In this JIRA we propose introducing a PAUSED container state.
> When a running container gets preempted, it enters the PAUSED state, where it
> remains until resources get freed up on the node; the preempted container
> can then resume to the running state.
>
> One scenario where this capability is useful is work preservation. How
> preemption is done, and whether the container supports it, is implementation
> specific.
> For instance, if the container is a virtual machine, then preempt would pause
> the VM and resume would restore it back to the running state.
> If the container doesn't support preemption, then preempt would default to
> killing the container.
>