[
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568792#comment-15568792
]
Arun Suresh commented on YARN-4597:
-----------------------------------
Thanks for taking a look [~jianhe],
bq. Wondering why KillWhileExitingTransition is added..
I had put it in there while debugging something and left it since I thought it
was harmless. But yes, it looks like it does override the exit code. Will
remove it. Good catch.
* w.r.t. {{ContainerState#SCHEDULED}}: Actually, I think we should expose this.
We currently club NEW, LOCALIZING, LOCALIZED etc. into RUNNING, which is
misleading since the container is not actually running. SCHEDULED implies
that some of the container's dependencies (resources for localization + some
internal queuing/scheduling policy) have not yet been met.
Prior to this, YARN-2877 had introduced the QUEUED return state. This would be
visible to applications if queuing was enabled. This patch technically just
renames QUEUED to SCHEDULED. Also, all containers will go through the SCHEDULED
state, not just the opportunistic ones (although for guaranteed containers
this will just be a pass-through state).
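To make the state mapping described above concrete, here is a minimal sketch.
The enum and method names are hypothetical and heavily simplified (the real NM
has more internal states); it only illustrates reporting pre-launch states as
SCHEDULED instead of RUNNING and is not taken from the patch.

```java
// Hypothetical, simplified internal NM container states (assumption).
enum InternalState { NEW, LOCALIZING, LOCALIZED, RUNNING, DONE }

// Hypothetical externally visible states, with SCHEDULED exposed.
enum ExposedState { SCHEDULED, RUNNING, COMPLETE }

class ContainerStateMapping {
    // Maps an internal state to the state an application would see.
    static ExposedState expose(InternalState s) {
        switch (s) {
            case NEW:
            case LOCALIZING:
            case LOCALIZED:
                // Dependencies not yet met: not running, so do not say RUNNING.
                return ExposedState.SCHEDULED;
            case RUNNING:
                return ExposedState.RUNNING;
            default:
                return ExposedState.COMPLETE;
        }
    }
}
```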
Another thing I was hoping to get some input on: currently, the
{{ContainerScheduler}} runs in the same thread as the ContainerManager's
AsyncDispatcher, and the scheduler is triggered only by events. I was
wondering if there is any merit in pushing these events into a blocking queue
as they arrive and having a separate thread take care of them. This would
preserve the serial nature of operation (and thereby keep the code simple by
not needing synchronized collections) and would not hold up the dispatcher
from delivering other events while the scheduler is scheduling.
A minor disadvantage is that the NM will probably consume a thread that, for
the most part, will be blocked on the queue. This thread could otherwise be
used by one of the containers.
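A rough sketch of the blocking-queue idea discussed above, using only
java.util.concurrent. The class, method, and event names are hypothetical
stand-ins and do not come from the patch; the point is only that the
dispatcher-side call returns immediately while a single worker thread drains
events in arrival order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a scheduler that owns its own worker thread.
class QueuedScheduler {
    // Hypothetical event type; a real scheduler event would carry more data.
    static final class Event {
        final String type;
        Event(String type) { this.type = type; }
    }

    // Sentinel ("poison pill") that tells the worker to exit.
    private static final Event STOP = new Event("STOP");

    private final BlockingQueue<Event> events = new LinkedBlockingQueue<>();
    // Touched only by the worker thread, so no synchronized collection needed.
    private final List<String> handled = new ArrayList<>();
    private final Thread worker = new Thread(this::drain, "container-scheduler");

    void start() { worker.start(); }

    // Called on the dispatcher thread: enqueue and return without blocking.
    void handle(String type) { events.add(new Event(type)); }

    private void drain() {
        try {
            while (true) {
                Event e = events.take();   // blocks while the queue is empty
                if (e == STOP) { return; }
                handled.add(e.type);       // serial processing, arrival order
            }
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }

    // Processes everything queued so far, then stops the worker.
    List<String> stopAndGetHandled() {
        events.add(STOP);
        try {
            worker.join();                 // join makes 'handled' safely visible
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return handled;
    }
}
```

The cost mentioned above is visible here: the worker thread spends most of its
time parked in take().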
> Add SCHEDULE to NM container lifecycle
> --------------------------------------
>
> Key: YARN-4597
> URL: https://issues.apache.org/jira/browse/YARN-4597
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: Chris Douglas
> Assignee: Arun Suresh
> Attachments: YARN-4597.001.patch, YARN-4597.002.patch
>
>
> Currently, the NM immediately launches containers after resource
> localization. Several features could be more cleanly implemented if the NM
> included a separate stage for reserving resources.