[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle

Jian He (JIRA) Mon, 07 Nov 2016 16:41:21 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15645979#comment-15645979
 ]


Jian He commented on YARN-4597:
-------------------------------

bq. scheduledToRunContainers
Sounds good.
bq.  Using shouldLaunchContainer also causes the CONTAINER_LAUNCH event to be 
fired which I do not want.
I think shouldLaunchContainer is also used to skip launching the container when 
a kill has already been requested (see the comment in the code below: 
"// Check if the container is signalled to be killed."). As for the additional 
event, I think the existing code should be changed as well: the code that sends 
the CONTAINER_LAUNCHED event should move into the 'else' block, where the 
container is actually launched.
{code}
    // LaunchContainer is a blocking call. We are here almost means the
    // container is launched, so send out the event.
    dispatcher.getEventHandler().handle(new ContainerEvent(
        containerId,
        ContainerEventType.CONTAINER_LAUNCHED));
    context.getNMStateStore().storeContainerLaunched(containerId);

    // Check if the container is signalled to be killed.
    if (!shouldLaunchContainer.compareAndSet(false, true)) {
      LOG.info("Container " + containerId + " not launched as "
          + "cleanup already called");
      return ExitCode.TERMINATED.getExitCode();
{code}
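To illustrate the suggested ordering, here is a minimal, self-contained sketch. It is not the actual launch code: LaunchSketch, markKilled(), the events list and the TERMINATED value are all hypothetical stand-ins. The only point is that the launched event is emitted in the 'else' branch, after the kill check has passed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the launch path; names are illustrative only.
class LaunchSketch {
  // Placeholder exit code (assumption, not the real ExitCode.TERMINATED value).
  static final int TERMINATED = -1;

  final AtomicBoolean shouldLaunchContainer = new AtomicBoolean(false);
  // Stands in for events sent through the dispatcher.
  final List<String> events = new ArrayList<>();

  // Hypothetical hook for the cleanup path: a kill arrived before launch.
  void markKilled() {
    shouldLaunchContainer.set(true);
  }

  int call() {
    // Check if the container is signalled to be killed.
    if (!shouldLaunchContainer.compareAndSet(false, true)) {
      // Cleanup won the race: no launch, and no launched event.
      return TERMINATED;
    } else {
      // Only here is the container really launched, so send the event here.
      events.add("CONTAINER_LAUNCHED");
      return 0;
    }
  }

  public static void main(String[] args) {
    LaunchSketch normal = new LaunchSketch();
    System.out.println(normal.call() + " " + normal.events);

    LaunchSketch killed = new LaunchSketch();
    killed.markKilled();
    System.out.println(killed.call() + " " + killed.events);
  }
}
```

With this ordering, a container that was killed before launch never reports CONTAINER_LAUNCHED.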
bq. Once queue limit is reached, no new opportunistic containers should also be 
queued. The AM is free to request it again. The MRAppMaster, for eg. 
re-requests the same task as a GUARANTEED container.
Sorry for being unclear; I meant the code below. It calls 'storeContainerQueued' 
unconditionally, even when the container is not queued because the queue-length 
limit has been reached. Shouldn't we skip the storeContainerQueued call in that 
case?

{code}
{
      try {
        this.context.getNMStateStore().storeContainerQueued(
            container.getContainerId());
      } catch (IOException e) {
        LOG.warn("Could not store container state into store..", e);
      }
      LOG.info("No available resources for container {} to start its execution "
          + "immediately.", container.getContainerId());
      if (container.getContainerTokenIdentifier().getExecutionType() ==
          ExecutionType.GUARANTEED) {
        queuedGuaranteedContainers.put(container.getContainerId(), container);
        // Kill running opportunistic containers to make space for
        // guaranteed container.
        killOpportunisticContainers(container);
      } else {
        if (queuedOpportunisticContainers.size() <= maxOppQueueLength) {
          LOG.info("Opportunistic container {} will be queued at the NM.",
              container.getContainerId());
          queuedOpportunisticContainers.put(
              container.getContainerId(), container);
        } else {
          LOG.info("Opportunistic container [{}] will not be queued at the NM" +
              "since max queue length [{}] has been reached",
              container.getContainerId(), maxOppQueueLength);
          container.sendKillEvent(
              ContainerExitStatus.KILLED_BY_CONTAINER_SCHEDULER,
              "Opportunistic container queue is full.");
        }
      }
    }
{code}
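One possible shape for this, as a minimal sketch only: QueueSketch, enqueue() and storedAsQueued are hypothetical names, and the list stands in for the storeContainerQueued call on the state store. The queued state is persisted only on the branches where the container really is queued. The size comparison is copied as-is from the snippet above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the NM queuing logic; names are illustrative only.
class QueueSketch {
  final int maxOppQueueLength;
  final Set<String> queuedGuaranteed = new LinkedHashSet<>();
  final Set<String> queuedOpportunistic = new LinkedHashSet<>();
  // Stands in for storeContainerQueued(); records the ids that were persisted.
  final List<String> storedAsQueued = new ArrayList<>();

  QueueSketch(int maxOppQueueLength) {
    this.maxOppQueueLength = maxOppQueueLength;
  }

  // Returns true if the container was queued, false if it was rejected.
  boolean enqueue(String containerId, boolean guaranteed) {
    if (guaranteed) {
      storedAsQueued.add(containerId);
      queuedGuaranteed.add(containerId);
      return true;
    }
    // Same comparison as in the quoted patch.
    if (queuedOpportunistic.size() <= maxOppQueueLength) {
      storedAsQueued.add(containerId);
      queuedOpportunistic.add(containerId);
      return true;
    }
    // Queue is full: the real code sends a kill event here.
    // Nothing is written to the state store for this container.
    return false;
  }

  public static void main(String[] args) {
    QueueSketch q = new QueueSketch(0);
    System.out.println(q.enqueue("opp1", false));
    System.out.println(q.enqueue("opp2", false));
    System.out.println(q.storedAsQueued);
  }
}
```

On recovery, the state store would then list only containers that were actually sitting in a queue.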

> Add SCHEDULE to NM container lifecycle
> --------------------------------------
>
>                 Key: YARN-4597
>                 URL: https://issues.apache.org/jira/browse/YARN-4597
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Chris Douglas
>            Assignee: Arun Suresh
>              Labels: oct16-hard
>         Attachments: YARN-4597.001.patch, YARN-4597.002.patch, 
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, 
> YARN-4597.006.patch
>
>
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.


