[ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15658402#comment-15658402
 ] 

Konstantinos Karanasos commented on YARN-4597:
----------------------------------------------

Here are some comments on the {{ContainerScheduler}}:

- {{queuedOpportunisticContainers}} will have concurrency issues. We are 
updating it when containers arrive but also in the 
{{shedQueuedOpportunisticContainers}}.

- {{queuedGuaranteedContainers}} and {{queuedOpportunisticContainers}}: I think 
we should use queues. I don't think we retrieve the container by the key 
anywhere either ways.

- {{oppContainersMarkedForKill}}: could be a Set, right?

- {{scheduledToRunContainers}} are containers that are either already running 
or are going to run very soon (transitioning from SCHEDULED to RUNNING state). 
Name is a bit misleading, because it sounds like they are only the ones 
belonging to the second category. I would rather say {{runningContainers}} and 
specify in a comment that they might not be running at this very moment but 
will be running very soon.

- In the {{onContainerCompleted()}}, the 
{{scheduledToRunContainers.remove(container.getContainerId())}} and the 
{{startPendingContainers()}} can go inside the if statement above. If the 
container was not running and no resources were freed up, we don't need to call 
the {{startPendingContainers()}}.

- fields of the {{opportunisticContainersStatus}} are set in different places. 
Due to that, when we call {{getOpportunisticContainersStatus()}} we may see an 
inconsistent object. Let's set the fields only in the 
{{getOpportunisticContainersStatus()}}.

- line 252, indeed we can now do extraOpportContainersToKill -> 
opportContainersToKill, as Karthik mentioned at a comment.

- line 87: increase -> increases

- {{shedQueuedOpportunisticContainers}}: 
-- {{numAllowed}} is the number of allowed containers in the queue. Instead, we 
are killing numAllowed containers. In other words, we should not kill 
numAllowed, but {{queuedOpportunisticContainers.size() - numAllowed}}.
-- "Container Killed to make room for Guaranteed Container." -> "Container 
killed to meet NM queuing limits". Instead of kill, you can also say de-queued.


> Add SCHEDULE to NM container lifecycle
> --------------------------------------
>
>                 Key: YARN-4597
>                 URL: https://issues.apache.org/jira/browse/YARN-4597
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Chris Douglas
>            Assignee: Arun Suresh
>              Labels: oct16-hard
>         Attachments: YARN-4597.001.patch, YARN-4597.002.patch, 
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, 
> YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, 
> YARN-4597.009.patch
>
>
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to