Maysam Yabandeh commented on YARN-2430:

Here are the current alternative solutions:

1. A simple, quick fix would be to cache the result of getResourceUsage in a 
field of Schedulable and invalidate the cache after each scheduling. The 
invalidation requires iterating over all schedulables, at a cost of O(n).

2. Alternatively, as suggested by Karthik, the cached result could be updated 
periodically as part of UpdateThread. This approach would also encourage moving 
the sorting to the UpdateThread, since the sort algorithm would no longer be 
provided with the most up-to-date data.

3. Karthik also brought up the option of a bottom-up update of the resource 
usage whenever something changes: each Schedulable pushes up the change in its 
resource usage after each change. This would require invoking the push-up 
method at the right places, and care must be taken in future changes not to 
forget to call it.
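
To make option 3 concrete, here is a minimal sketch of the push-up idea. 
UsageNode, pushUp and the single long-valued usage are hypothetical 
simplifications for illustration; they are not the actual Schedulable or 
Resource classes in YARN.

```java
// Hypothetical stand-in for a Schedulable (app, leaf queue or parent queue).
// Usage is reduced to a single long (say, MB of memory) for brevity.
class UsageNode {
    private final UsageNode parent;  // null for the root queue
    private long cachedUsage;        // kept up to date by pushUp()

    UsageNode(UsageNode parent) {
        this.parent = parent;
    }

    // Option 3: apply the delta to this node and to every ancestor.
    // The cost is O(depth of the hierarchy) per change.
    void pushUp(long delta) {
        for (UsageNode n = this; n != null; n = n.parent) {
            n.cachedUsage += delta;
        }
    }

    // O(1) read; this is what a comparator would call during sorting.
    long getResourceUsage() {
        return cachedUsage;
    }
}
```

In this sketch every allocation or release has to go through pushUp, which is 
exactly the maintenance risk mentioned above: a code path that changes usage 
without calling it silently leaves the cached values stale.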

I would highly appreciate any comments.

> FairShareComparator: cache the results of getResourceUsage()
> ------------------------------------------------------------
>                 Key: YARN-2430
>                 URL: https://issues.apache.org/jira/browse/YARN-2430
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Maysam Yabandeh
>            Assignee: Maysam Yabandeh
> The compare method of FairShareComparator makes 3 invocations of 
> getResourceUsage per comparable object. In the case of queues, the 
> implementation of getResourceUsage requires iterating over the apps and 
> adding up their current usage. The compare method can reuse the result of 
> getResourceUsage to reduce the load to a third. However, to further reduce 
> the load, the result of getResourceUsage can be cached in FSLeafQueue. This 
> would be more efficient since the compare method is invoked at least once on 
> each Comparable object.
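
The first improvement in the description (reusing the result inside compare) 
could look roughly like the sketch below. Sched and SingleReadComparator are 
hypothetical, one-dimensional simplifications, and the ordering rules are only 
an approximation of a fair-share comparison, not the actual 
FairShareComparator code.

```java
// Hypothetical stand-in for a Schedulable; counts how often usage is read.
class Sched {
    final long usage;
    final long minShare;
    final double weight;
    int usageCalls = 0;

    Sched(long usage, long minShare, double weight) {
        this.usage = usage;
        this.minShare = minShare;
        this.weight = weight;
    }

    // In a real queue this would iterate over the apps, hence the cost.
    long getResourceUsage() {
        usageCalls++;
        return usage;
    }
}

class SingleReadComparator implements java.util.Comparator<Sched> {
    @Override
    public int compare(Sched a, Sched b) {
        // Read each usage exactly once and reuse the local values below.
        long ua = a.getResourceUsage();
        long ub = b.getResourceUsage();

        boolean aNeedy = ua < a.minShare;
        boolean bNeedy = ub < b.minShare;
        if (aNeedy && !bNeedy) return -1;
        if (bNeedy && !aNeedy) return 1;
        if (aNeedy) {
            // Both are below their min share: compare usage / minShare.
            return Double.compare((double) ua / a.minShare,
                                  (double) ub / b.minShare);
        }
        // Neither is below its min share: compare usage / weight.
        return Double.compare(ua / a.weight, ub / b.weight);
    }
}
```

This only removes the repeated calls within a single compare invocation. Since 
sorting calls compare many times per object, caching the value on the queue 
itself is what addresses the remaining cost.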

