[ 
https://issues.apache.org/jira/browse/YARN-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522495#comment-14522495
 ] 

Wangda Tan commented on YARN-3553:
----------------------------------

[~xinxianyin],
I think what you mentioned is a valid point, but it should be already resolved 
by {{updateSchedulingResourceUsage}}. When we reorder application, we will 
first remove it from TreeSet, call {{updateSchedulingResourceUsage}} to update 
it's cached demand/usage, and reinsert it back to TreeSet. And FairComparator 
also uses cached demand/pending to compare two applications.

If SchedulableEntity doesn't touch cached fields (it shouldn't touch), and 
TreeSet uses cached fields to compare items, this problem isn't existed, do you 
agree?

+[~cwelch]

> TreeSet is not a nice container for organizing schedulableEntities.
> -------------------------------------------------------------------
>
>                 Key: YARN-3553
>                 URL: https://issues.apache.org/jira/browse/YARN-3553
>             Project: Hadoop YARN
>          Issue Type: Wish
>          Components: scheduler
>            Reporter: Xianyin Xin
>
> For TreeSet, element is identified by comparator, not the object reference. 
> If any *attributes that used for comparing two elements* of an specific 
> element is modified by other methods, the TreeSet will be in an un-sorted 
> state, and cannot become sorted forever except that we reconstruct another 
> TreeSet with the elements. To avoid this, one must be *very careful* when 
> they try to modify the attributes (such as increase or decrease the used 
> capacity of a schedulabeEntity) of an object.
> An example in AbstractComparatorOrderingPolicy.java, Line63,
> {code}
>   protected void reorderSchedulableEntity(S schedulableEntity) {
>     //remove, update comparable data, and reinsert to update position in order
>     schedulableEntities.remove(schedulableEntity);
>     updateSchedulingResourceUsage(
>         schedulableEntity.getSchedulingResourceUsage());
>     schedulableEntities.add(schedulableEntity);
>   }
> {code}
> This method tries to remove the schedulableEntity first and then reinsert it 
> so as to reorder the set. However, the changes of the schedulableEntity 
> should be done in the middle of the above two operations. But the comparator 
> of the class is not clear, so we don't know which attributes of the 
> schedulableEntity was changed. If we changed the schedulableEntity outside 
> the method and then inform the orderingPolicy that we made such a change, the 
> operation "schedulableEntities.remove(schedulableEntity)" would not work 
> correctly since the element of a TreeSet is identified by comparator. Any 
> implement class of this abstract class should overwrite this method, but few 
> does. Another choice is that we make modification of a schedulableEntity 
> manually, but we mustn't forget to reorder the set when we do so and must 
> remember the order: remove, modify the attributes(used for comparing), 
> insert, or use an iterator to mark the schedulableEntity so that we can 
> remove and reinsert it correctly.
> YARN-897 is an example that we fell into the trap. If the comparator become 
> complex in future, e.g., we consider other types of resources in comparator, 
> such traps will be more and disperse anywhere, which makes it easy to let a 
> TreeSet become a un-sorted state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to