[ 
https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212848#comment-16212848
 ] 

Miklos Szegedi commented on YARN-4511:
--------------------------------------

Thank you, [~haibochen] for the patch.
{code}
342         // notify schedulerNode of the update to correct resource accounting
343         node.containerUpdated(existingRMContainer, existingContainer);
344     
345         
((RMContainerImpl)tempRMContainer).setContainer(updatedTempContainer);
346         // notify SchedulerNode of the update to correct resource accounting
347         node.containerUpdated(tempRMContainer, tempContainer);
348     
{code}
I think that it would be nicer to lock around these two calls to become atomic.
{code}
431       public int getNumOpportunisticContainers() {
432         return numOpportunisticContainers;
321       }
{code}
This function takes a sample but does not lock. This is fine, however we need 
to make sure it reflects the state of the object, so for example 
allocateContainer() should set this value as the last step after the 
allocatedContainers.put() call.
If containerResourceAllocated fails in guaranteedContainerResourceAllocated we 
will still call allocatedContainers.put(). I think this may cause some 
inconsistencies in the future. Probably it is better to propagate the false 
return code all the way to the caller.
isValidGuaranteedContainer and isValidOpportunisticContainer contain the same 
code. Should they be different? Would an isValidContainer function be 
sufficient?
{code}
294         Container container = rmContainer.getContainer();
295         if (container.getExecutionType() == ExecutionType.GUARANTEED) {
296           guaranteedContainerResourceReleased(container);
297           allocatedContainers.remove(containerId);
298           numGuaranteedContainers--;
299         } else {
300           opportunisticContainerResourceReleased(container);
301           numOpportunisticContainers--;
302           allocatedContainers.remove(containerId);
303         }
{code}
allocatedContainers.remove(containerId); can be placed outside the if.

containerResourceReleased should decrease resourceAllocatedPendingLaunch, if 
the container has not been started, yet.

guaranteedContainerResourceReleased may fail inside but regardless of the 
outcome, we decrease numGuaranteedContainers.
{{ + ", which has " + getNumGuaranteedContainers() + " containers, "}} should 
be {{ + ", which has " + getNumGuaranteedContainers() + " guaranteed 
containers, "}}
I do not see unit tests added for getNumOpportunisticContainers() and 
opportunistic container code paths added in general.


> Common scheduler changes supporting scheduler-specific implementations
> ----------------------------------------------------------------------
>
>                 Key: YARN-4511
>                 URL: https://issues.apache.org/jira/browse/YARN-4511
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Haibo Chen
>         Attachments: YARN-4511-YARN-1011.00.patch, 
> YARN-4511-YARN-1011.01.patch, YARN-4511-YARN-1011.02.patch, 
> YARN-4511-YARN-1011.03.patch, YARN-4511-YARN-1011.04.patch, 
> YARN-4511-YARN-1011.05.patch, YARN-4511-YARN-1011.06.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to