Arun Suresh commented on YARN-4597:

Thanks for taking a look [~jianhe],

bq. Wondering why KillWhileExitingTransition is added..
I had put it in there while debugging something and left it in since I thought 
it was harmless. But yeah, it looks like it does override the exit code. Will 
remove it. Good catch.

* w.r.t. {{ContainerState#SCHEDULED}}: Actually, I think we should expose this. 
We currently club NEW, LOCALIZING, LOCALIZED etc. into RUNNING, but the 
container is not actually running, which is misleading. SCHEDULED implies 
that some of the container's dependencies (resources for localization + some 
internal queuing/scheduling policy) have not yet been met.
Prior to this, YARN-2877 had introduced the QUEUED return state, which would be 
visible to applications if queuing was enabled. This patch technically just 
renames QUEUED to SCHEDULED. Also, all containers will go through the SCHEDULED 
state, not just the opportunistic ones (although for guaranteed containers 
this will just be a pass-through state).
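
To make the reporting change concrete, here is a minimal sketch of the two mappings described above. All names ({{ReportedStateSketch}}, {{InternalState}}, {{ReportedState}}) are hypothetical and are not taken from the patch. The internal queuing step is left out, and only the states named in this comment are modelled.

```java
// Hypothetical illustration only; these are not the NM's real enums.
final class ReportedStateSketch {
  /** Subset of NM-internal container states mentioned in the comment. */
  enum InternalState { NEW, LOCALIZING, LOCALIZED, RUNNING }

  /** States as seen by an application. */
  enum ReportedState { SCHEDULED, RUNNING }

  /** Current behaviour as described: every state above is reported as RUNNING. */
  static ReportedState current(InternalState state) {
    return ReportedState.RUNNING;
  }

  /** Proposed behaviour: anything that has not launched yet is SCHEDULED. */
  static ReportedState proposed(InternalState state) {
    return state == InternalState.RUNNING
        ? ReportedState.RUNNING
        : ReportedState.SCHEDULED;
  }
}
```

With the proposed mapping, a container that is still LOCALIZING is reported as SCHEDULED rather than RUNNING.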

Another thing I was hoping to get some input on: currently, the 
{{ContainerScheduler}} runs in the same thread as the ContainerManager's 
AsyncDispatcher. Also, the scheduler is triggered only by events. I was 
wondering if there is any merit in pushing these events into a blocking queue 
as they arrive and having a separate thread take care of them. This would 
preserve the serial nature of operation (and thereby keep the code simple by 
not needing synchronized collections) and would not hold up the dispatcher from 
delivering other events while the scheduler is busy.
A minor disadvantage is that the NM will probably consume a thread that, for 
the most part, will be blocked on the queue. This thread could otherwise be 
used by one of the containers.
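
A rough sketch of the blocking-queue idea, only to make the trade-off concrete. Everything here ({{QueuedSchedulerSketch}}, {{SchedulerEvent}}, the stop marker) is a hypothetical stand-in and not code from the patch. It assumes events can be handled strictly in arrival order by a single worker thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical names throughout; this is not the patch's ContainerScheduler.
final class QueuedSchedulerSketch {
  /** Hypothetical stand-in for a scheduler event. */
  static final class SchedulerEvent {
    final String containerId;
    SchedulerEvent(String containerId) { this.containerId = containerId; }
  }

  // Sentinel used to tell the worker thread to exit.
  private static final SchedulerEvent STOP = new SchedulerEvent("<stop>");

  private final BlockingQueue<SchedulerEvent> queue = new LinkedBlockingQueue<>();
  // Mutated only by the worker thread, so a plain ArrayList is enough.
  private final List<String> handled = new ArrayList<>();
  private final Thread worker = new Thread(this::drain, "scheduler-sketch");

  void start() { worker.start(); }

  /** Called on the dispatcher thread: enqueue and return immediately. */
  void handle(SchedulerEvent event) { queue.add(event); }

  /** Worker loop: blocks on the queue and handles events one at a time. */
  private void drain() {
    try {
      while (true) {
        SchedulerEvent event = queue.take();
        if (event == STOP) {
          return;
        }
        handled.add(event.containerId); // real scheduling work would go here
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Lets queued events finish, stops the worker, and returns ids in order. */
  List<String> stopAndGetHandled() {
    queue.add(STOP);
    try {
      worker.join(); // join makes the worker's writes visible to the caller
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while stopping", e);
    }
    return handled;
  }
}
```

The dispatcher only pays for the enqueue, while the worker sits in {{take()}} whenever the queue is empty, which is the idle thread mentioned above.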

> Add SCHEDULE to NM container lifecycle
> --------------------------------------
>                 Key: YARN-4597
>                 URL: https://issues.apache.org/jira/browse/YARN-4597
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Chris Douglas
>            Assignee: Arun Suresh
>         Attachments: YARN-4597.001.patch, YARN-4597.002.patch
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.
