[ 
https://issues.apache.org/jira/browse/YARN-6163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15862471#comment-15862471
 ] 

ASF GitHub Bot commented on YARN-6163:
--------------------------------------

Github user kambatla commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/192#discussion_r100673210
  
    --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java ---
    @@ -1106,6 +1111,97 @@ boolean isStarvedForFairShare() {
         return !Resources.isNone(fairshareStarvation);
       }
     
    +  /**
    +   * Helper method for {@link #getStarvedResourceRequests()}:
    +   * Given a map of visited {@link ResourceRequest}s, it checks if
    +   * {@link ResourceRequest} 'rr' has already been visited. The map is updated
    +   * to reflect visiting 'rr'.
    +   */
    +  private static boolean checkAndMarkRRVisited(
    +      Map<Priority, List<Resource>> visitedRRs, ResourceRequest rr) {
    +    Priority priority = rr.getPriority();
    +    Resource capability = rr.getCapability();
    +    if (visitedRRs.containsKey(priority)) {
    +      List<Resource> rrList = visitedRRs.get(priority);
    +      if (rrList.contains(capability)) {
    --- End diff --
    
    Yeah, it looks like there is indeed a bug here.
    
    Consider an app that asks for one container each on two nodes on the same rack:
    - If this code encounters either of these node-local requests first, it ignores the other node-local request and the rack request. Ignoring the other node-local request is undesired.
    - If this code encounters the rack-local request first, it ignores the node-local requests. This is desired.
    
    Maybe, on encountering a node-local request, we should mark the rack and ANY as "visited". What do we do when we encounter rack or ANY first? Let me think more about this.
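
    A minimal sketch of the collision described above, not the actual patch. The diff is cut off after the `contains` check, so the rest of the method is an assumption here: it is taken to return true when the capability was already seen and to record it otherwise. `Req`, `VisitedRRSketch`, and `unvisited` are hypothetical stand-ins; priority and capability are reduced to plain ints, and container counts are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for ResourceRequest: only the fields relevant here.
final class Req {
  final int priority;         // stands in for Priority
  final int memoryMb;         // stands in for the Resource capability
  final String resourceName;  // node name, rack name, or "*" for ANY

  Req(int priority, int memoryMb, String resourceName) {
    this.priority = priority;
    this.memoryMb = memoryMb;
    this.resourceName = resourceName;
  }
}

final class VisitedRRSketch {
  // Same shape as the check in the diff: keyed on priority, then capability.
  // The resource name (node, rack, ANY) plays no part in the check.
  // Assumed behavior for the part the diff does not show: return true if
  // already seen, otherwise record the capability and return false.
  static boolean checkAndMarkVisited(
      Map<Integer, List<Integer>> visited, Req rr) {
    List<Integer> seen =
        visited.computeIfAbsent(rr.priority, p -> new ArrayList<>());
    if (seen.contains(rr.memoryMb)) {
      return true;
    }
    seen.add(rr.memoryMb);
    return false;
  }

  // Returns the resource names of the requests that survive the check,
  // in encounter order.
  static List<String> unvisited(List<Req> requests) {
    Map<Integer, List<Integer>> visited = new HashMap<>();
    List<String> kept = new ArrayList<>();
    for (Req rr : requests) {
      if (!checkAndMarkVisited(visited, rr)) {
        kept.add(rr.resourceName);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // One container each on node1 and node2, both on rack1.
    List<Req> nodeFirst = Arrays.asList(
        new Req(1, 1024, "node1"), new Req(1, 1024, "node2"),
        new Req(1, 1024, "rack1"), new Req(1, 1024, "*"));
    List<Req> rackFirst = Arrays.asList(
        new Req(1, 1024, "rack1"), new Req(1, 1024, "node1"),
        new Req(1, 1024, "node2"), new Req(1, 1024, "*"));

    System.out.println(unvisited(nodeFirst)); // [node1]: node2 is dropped
    System.out.println(unvisited(rackFirst)); // [rack1]: nodes are dropped
  }
}
```

    Because the key is only (priority, capability), the second node-local request is indistinguishable from the first. Any fix along the lines suggested above would need the resource name, or its locality level, to take part in the check.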


> FS Preemption is a trickle for severely starved applications
> ------------------------------------------------------------
>
>                 Key: YARN-6163
>                 URL: https://issues.apache.org/jira/browse/YARN-6163
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>    Affects Versions: 2.9.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>         Attachments: yarn-6163-1.patch
>
>
> With the current logic, only one RR is considered each time an application 
> is marked starved. This marking happens only on the update call that runs 
> every 500ms. Due to this, an application that is severely starved takes 
> forever to reach fairshare based on preemptions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
