[ https://issues.apache.org/jira/browse/YARN-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15743071#comment-15743071 ]
Arun Suresh commented on YARN-5216:
-----------------------------------

Just clarifying my position:
* The ability to PAUSE a process is something the environment (executor + runtime + OS) on which the NM is running provides.
* W.r.t. the NM/ContainerScheduler, if it finds that an opportunistic container is using resources that can be given to a waiting guaranteed container, it just needs the underlying environment (executor + runtime + OS) to reclaim those resources so it can start the guaranteed container.
* HOW the environment does this should not really matter to the NM/ContainerScheduler. It just needs a callback from the launcher/executor saying whether the container in question was killed or paused, which tells it whether to place the container back in the queue.
* IMO, I don't see a case where the NM/Scheduler should prefer one over the other, which is why I suggested a common *preempt* method (I am not particular about the name; we can call it *reclaimResources* maybe) in the Executor. If we later do find such a case, there is already the *kill* method anyway. In fact, I feel the executor should first try to pause and, only if that is not supported, kill the opportunistic container.

Given the above, and the fact that most large deployments use their own specialized Executor class / binary anyway (I pointed to YARN-5673, since it also discusses having dynamically loaded modules / separate binaries in the default distribution, selected by flags passed during the build), I was wondering why we should expose a yarn-site.xml knob in the first place. I am fine with making this decision later too.

For the time being, consider adding *reclaimResources* to the ContainerExecutor, which, based on some configuration, delegates to kill or pause.
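
To make the proposal concrete, here is a minimal sketch of a single reclaimResources entry point that tries pause first and falls back to kill. It is not the actual YARN API: SketchExecutor, RecordingExecutor, SketchScheduler, ReclaimOutcome and the pauseSupported flag are hypothetical stand-ins for the ContainerExecutor, the NM/ContainerScheduler and the configuration mentioned above. It also assumes that a paused container is the one that goes back in the queue, which the comment does not spell out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical callback value: how the environment reclaimed the resources. */
enum ReclaimOutcome { PAUSED, KILLED }

/** Hypothetical, simplified stand-in for the NM's ContainerExecutor. */
abstract class SketchExecutor {
    // Assumption: stands in for "some configuration" / environment capability.
    private final boolean pauseSupported;

    SketchExecutor(boolean pauseSupported) {
        this.pauseSupported = pauseSupported;
    }

    abstract void pause(String containerId);

    abstract void kill(String containerId);

    /** Try to pause first; kill only if pausing is not supported. */
    final ReclaimOutcome reclaimResources(String containerId) {
        if (pauseSupported) {
            pause(containerId);
            return ReclaimOutcome.PAUSED;
        }
        kill(containerId);
        return ReclaimOutcome.KILLED;
    }
}

/** Toy executor that only records which action was taken. */
class RecordingExecutor extends SketchExecutor {
    final List<String> actions = new ArrayList<>();

    RecordingExecutor(boolean pauseSupported) {
        super(pauseSupported);
    }

    @Override
    void pause(String containerId) {
        actions.add("pause:" + containerId);
    }

    @Override
    void kill(String containerId) {
        actions.add("kill:" + containerId);
    }
}

/** Hypothetical scheduler: it does not care HOW resources were reclaimed. */
class SketchScheduler {
    private final SketchExecutor executor;
    final Deque<String> queued = new ArrayDeque<>();

    SketchScheduler(SketchExecutor executor) {
        this.executor = executor;
    }

    /** Reclaims an opportunistic container's resources and re-queues it if paused. */
    ReclaimOutcome preempt(String opportunisticContainerId) {
        ReclaimOutcome outcome = executor.reclaimResources(opportunisticContainerId);
        if (outcome == ReclaimOutcome.PAUSED) {
            // Assumption: a paused container waits in the queue to be resumed.
            queued.addLast(opportunisticContainerId);
        }
        return outcome;
    }
}

class ReclaimDemo {
    public static void main(String[] args) {
        SketchScheduler scheduler = new SketchScheduler(new RecordingExecutor(true));
        System.out.println(scheduler.preempt("container_01"));
    }
}
```

With this shape the scheduler only ever calls reclaimResources and reacts to the outcome, so a deployment with a specialized executor can change the pause/kill behaviour without the scheduler noticing, and the existing kill method remains available if a case for forcing a kill turns up later.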

> Expose configurable preemption policy for OPPORTUNISTIC containers running on the NM
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-5216
>                 URL: https://issues.apache.org/jira/browse/YARN-5216
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: distributed-scheduling
>            Reporter: Arun Suresh
>            Assignee: Hitesh Sharma
>              Labels: oct16-hard
>         Attachments: YARN-5216-YARN-5972.001.patch, YARN5216.001.patch, yarn5216.002.patch
>
>
> Currently, the default action taken by the QueuingContainerManager, introduced in YARN-2883, when a GUARANTEED Container is scheduled on an NM with OPPORTUNISTIC containers using up resources, is to KILL the running OPPORTUNISTIC containers.
> This JIRA proposes to expose a configurable hook to allow the NM to take a different action.