[ 
https://issues.apache.org/jira/browse/YARN-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15743071#comment-15743071
 ] 

Arun Suresh commented on YARN-5216:
-----------------------------------

Just clarifying my position:
* The ability to PAUSE a process is something the environment (executor + 
runtime + OS) on which the NM is running provides.
* W.r.t. the NM/ContainerScheduler, if it finds that an opportunistic container 
is using resources that can be given to a waiting guaranteed container, it just 
needs the underlying environment (executor + runtime + OS) to reclaim those 
resources so it can start the guaranteed container.
* HOW the environment does this should not really matter to the 
NM/ContainerScheduler. It just needs a callback from the launcher/executor 
saying whether the container in question was killed or paused, so that it 
knows whether to place it back in the queue.
* IMO, there is no case where the NM/Scheduler should prefer one over the 
other, which is why I suggested a common *preempt* method (I am not 
particular about the name; we can call it *reclaimResources* maybe) in the 
Executor. If we do find such a case later, there is already the *kill* method 
anyway. In fact, I feel the executor should first try to pause, and kill the 
opportunistic container only if pausing is not supported.
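
To make the points above concrete, here is a minimal sketch of the pause-first 
fallback and the outcome callback. All type and method names (ReclaimSketch, 
Outcome, supportsPause, preemptOpportunistic) are hypothetical and are not the 
real YARN classes. It also assumes that a paused container goes back in the 
queue while a killed one does not.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch: none of these types are the actual YARN classes.
class ReclaimSketch {

    // What the executor reports back to the scheduler.
    enum Outcome { PAUSED, KILLED }

    interface Executor {
        boolean supportsPause();
        void pause(String containerId);
        void kill(String containerId);

        // Try to pause first; kill only when pausing is not supported.
        default Outcome reclaimResources(String containerId) {
            if (supportsPause()) {
                pause(containerId);
                return Outcome.PAUSED;
            }
            kill(containerId);
            return Outcome.KILLED;
        }
    }

    // Stand-in executor whose pause/kill do nothing; for illustration only.
    static class FakeExecutor implements Executor {
        private final boolean canPause;
        FakeExecutor(boolean canPause) { this.canPause = canPause; }
        @Override public boolean supportsPause() { return canPause; }
        @Override public void pause(String containerId) { /* no-op */ }
        @Override public void kill(String containerId) { /* no-op */ }
    }

    // The scheduler does not care HOW resources were reclaimed,
    // only whether the container was paused or killed.
    static class Scheduler {
        private final Executor executor;
        final Queue<String> queued = new ArrayDeque<>();

        Scheduler(Executor executor) { this.executor = executor; }

        Outcome preemptOpportunistic(String containerId) {
            Outcome outcome = executor.reclaimResources(containerId);
            if (outcome == Outcome.PAUSED) {
                // Assumption: a paused container is re-queued for later.
                queued.add(containerId);
            }
            return outcome;
        }
    }

    public static void main(String[] args) {
        Scheduler scheduler = new Scheduler(new FakeExecutor(true));
        System.out.println(scheduler.preemptOpportunistic("container_1"));
        System.out.println(scheduler.queued);
    }
}
```

With this shape the scheduler only ever calls reclaimResources, and the 
choice between pause and kill stays inside the executor.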

Given the above, and the fact that most large deployments use their own 
specialized Executor class / binary anyway (I pointed to YARN-5673, since there 
is discussion of having dynamically loaded modules / separate binaries in the 
default distribution based on flags passed during build as well), I was 
wondering why we should expose a yarn-site.xml knob in the first place.

I am fine with making this decision later too. For the time being, consider 
adding *reclaimResources* to the ContainerExecutor, which, based on some 
configuration, delegates to kill or pause.
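
A rough sketch of what that interim delegation could look like follows. The 
property name, the Action enum and the Executor class are all made up for 
illustration, and a plain Map stands in for the Hadoop configuration object. 
The default of KILL mirrors the current behaviour described in this JIRA.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: not the real ContainerExecutor or a real YARN property.
class ConfiguredReclaimSketch {

    enum Action { KILL, PAUSE }

    // Made-up property name, used here only for illustration.
    static final String ACTION_KEY = "sketch.opportunistic.reclaim-action";

    // Reads the configured action, defaulting to KILL when the key is unset.
    // An unrecognised value makes Action.valueOf throw IllegalArgumentException.
    static Action parseAction(Map<String, String> conf) {
        String raw = conf.getOrDefault(ACTION_KEY, "KILL");
        return Action.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    static class Executor {
        private final Action action;
        // Records calls so the delegation can be observed.
        final List<String> calls = new ArrayList<>();

        Executor(Map<String, String> conf) { this.action = parseAction(conf); }

        void kill(String containerId) { calls.add("kill " + containerId); }

        void pause(String containerId) { calls.add("pause " + containerId); }

        // Single entry point for the scheduler; delegates based on configuration
        // and reports which action was taken.
        Action reclaimResources(String containerId) {
            switch (action) {
                case PAUSE:
                    pause(containerId);
                    break;
                default:
                    kill(containerId);
                    break;
            }
            return action;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ACTION_KEY, "pause");
        Executor executor = new Executor(conf);
        System.out.println(executor.reclaimResources("container_1"));
        System.out.println(executor.calls);
    }
}
```

If the knob is dropped later, only parseAction and the switch need to change; 
callers of reclaimResources are unaffected.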

> Expose configurable preemption policy for OPPORTUNISTIC containers running on 
> the NM
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-5216
>                 URL: https://issues.apache.org/jira/browse/YARN-5216
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: distributed-scheduling
>            Reporter: Arun Suresh
>            Assignee: Hitesh Sharma
>              Labels: oct16-hard
>         Attachments: YARN-5216-YARN-5972.001.patch, YARN5216.001.patch, 
> yarn5216.002.patch
>
>
> Currently, the default action taken by the QueuingContainerManager, 
> introduced in YARN-2883, when a GUARANTEED Container is scheduled on an NM 
> with OPPORTUNISTIC containers using up resources, is to KILL the running 
> OPPORTUNISTIC containers.
> This JIRA proposes to expose a configurable hook to allow the NM to take a 
> different action.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

