[
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arun Suresh updated YARN-4597:
------------------------------
Attachment: YARN-4597.006.patch
Rebased patch after YARN-2995 and latest commits to trunk.
[~jianhe], thanks for the comments..
bq. maybe rename ContainerScheduler#runningContainers to scheduledContainers
Given that we the SCHEDULED state is a state that comes before RUNNING. and the
*runningContainers* collection actually holds containers that have been marked
to run by the scheduler, am thinking scheduled containers may not be apt here.
How about *scheduledToRunContainers* ?
bq. The ContainerLaunch#killedBeforeStart flag, looks like the exising flag
'shouldLaunchContainer' serves the same purpose, can we reuse that ? if so, the
container#isMarkedToKill is also not needed.
Hmm.. my understanding is that it a slightly different. The
*shouldLaunchContainer* is IIUC, used during the recovery process to signal
that the container should be launched or not. What *killedBeforeStart* aims to
do is to notify the *ContainerLaunch* (which runs in a different thread) that
the Scheduler might have requested to start the container earlier, but in the
last minute decided to kill it. Using shouldLaunchContainer also causes the
CONTAINER_LAUNCH event to be fired which I do not want.
bq. NodeManager#containerScheduler variable not used, remove
Done
bq. I think this comment is not addressed ? "In case we exceed the max-queue
length, we are killing the container directly instead of queueing the
container, in this case, we should not store the container as queued?"
Yeah.. meant to comment on it. This is actually the desired behavior. Once
queue limit is reached, no new opportunistic containers should also be queued.
The AM is free to request it again. The MRAppMaster, for eg. re-requests the
same task as a GUARANTEED container.
Hope this made sense ?
> Add SCHEDULE to NM container lifecycle
> --------------------------------------
>
> Key: YARN-4597
> URL: https://issues.apache.org/jira/browse/YARN-4597
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager
> Reporter: Chris Douglas
> Assignee: Arun Suresh
> Labels: oct16-hard
> Attachments: YARN-4597.001.patch, YARN-4597.002.patch,
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch,
> YARN-4597.006.patch
>
>
> Currently, the NM immediately launches containers after resource
> localization. Several features could be more cleanly implemented if the NM
> included a separate stage for reserving resources.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]