[
https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212848#comment-16212848
]
Miklos Szegedi commented on YARN-4511:
--------------------------------------
Thank you, [~haibochen] for the patch.
{code}
342 // notify schedulerNode of the update to correct resource accounting
343 node.containerUpdated(existingRMContainer, existingContainer);
344
345
((RMContainerImpl)tempRMContainer).setContainer(updatedTempContainer);
346 // notify SchedulerNode of the update to correct resource accounting
347 node.containerUpdated(tempRMContainer, tempContainer);
348
{code}
I think that it would be nicer to lock around these two calls to become atomic.
{code}
431 public int getNumOpportunisticContainers() {
432 return numOpportunisticContainers;
321 }
{code}
This function takes a sample but does not lock. This is fine, however we need
to make sure it reflects the state of the object, so for example
allocateContainer() should set this value as the last step after the
allocatedContainers.put() call.
If containerResourceAllocated fails in guaranteedContainerResourceAllocated we
will still call allocatedContainers.put(). I think this may cause some
inconsistencies in the future. Probably it is better to propagate the false
return code all the way to the caller.
isValidGuaranteedContainer and isValidOpportunisticContainer contain the same
code. Should they be different? Would an isValidContainer function be
sufficient?
{code}
294 Container container = rmContainer.getContainer();
295 if (container.getExecutionType() == ExecutionType.GUARANTEED) {
296 guaranteedContainerResourceReleased(container);
297 allocatedContainers.remove(containerId);
298 numGuaranteedContainers--;
299 } else {
300 opportunisticContainerResourceReleased(container);
301 numOpportunisticContainers--;
302 allocatedContainers.remove(containerId);
303 }
{code}
allocatedContainers.remove(containerId); can be placed outside the if.
containerResourceReleased should decrease resourceAllocatedPendingLaunch, if
the container has not been started, yet.
guaranteedContainerResourceReleased may fail inside but regardless of the
outcome, we decrease numGuaranteedContainers.
{{ + ", which has " + getNumGuaranteedContainers() + " containers, "}} should
be {{ + ", which has " + getNumGuaranteedContainers() + " guaranteed
containers, "}}
I do not see unit tests added for getNumOpportunisticContainers() and
opportunistic container code paths added in general.
> Common scheduler changes supporting scheduler-specific implementations
> ----------------------------------------------------------------------
>
> Key: YARN-4511
> URL: https://issues.apache.org/jira/browse/YARN-4511
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Assignee: Haibo Chen
> Attachments: YARN-4511-YARN-1011.00.patch,
> YARN-4511-YARN-1011.01.patch, YARN-4511-YARN-1011.02.patch,
> YARN-4511-YARN-1011.03.patch, YARN-4511-YARN-1011.04.patch,
> YARN-4511-YARN-1011.05.patch, YARN-4511-YARN-1011.06.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]