Yuqi Wang updated YARN-7872:
    Target Version/s: 3.0.0, 2.7.2  (was: 2.7.2)

> labeled node cannot be used to satisfy locality specified request
> -----------------------------------------------------------------
>                 Key: YARN-7872
>                 URL: https://issues.apache.org/jira/browse/YARN-7872
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler, resourcemanager
>    Affects Versions: 2.7.2
>            Reporter: Yuqi Wang
>            Assignee: Yuqi Wang
>            Priority: Blocker
>             Fix For: 2.7.2
>         Attachments: YARN-7872-branch-
> *Issue summary:*
> A labeled node (i.e. a node with a non-empty node label) cannot be used to 
> satisfy a locality-specified request (i.e. a container request with a 
> non-ANY resource name and relax locality set to false).
> *For example:*
> The node with available resource:
> [Resource: [MemoryMB: [100] CpuNumber: [12]] {color:#14892c}NodeLabel: 
> [persistent]{color} {color:#f79232}HostName: \{SRG}{color} RackName: 
> \{/default-rack}]
> The container request:
>  [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] 
> {color:#14892c}NodeLabel: [null]{color} {color:#f79232}HostNames: 
> \{SRG}{color} RackNames: {} {color:#59afe1}RelaxLocality: [false]{color}]
> The current RM capacity scheduler behavior (at least in versions 2.7 and 
> 2.8) is that the node cannot allocate a container for this request, because 
> the node label does not match when the leaf queue assigns containers.
> *Possible solution:*
> Node locality and node label should be two orthogonal dimensions for 
> selecting candidate nodes for a container request. Node label matching 
> should be performed only for container requests with the ANY resource name, 
> since only that kind of request is allowed to carry a non-empty node label.
> So, for a container request with a non-ANY resource name (which we therefore 
> know should not carry a node label), we should match the requested resource 
> name against the node instead of matching the requested node label against 
> the node. This resource name matching should be safe, since a node whose 
> label is not accessible to the queue is never sent to the leaf queue.
> *Discussion:*
> The attachment is a fix that follows this principle; please help review it.
> Without it, we cannot use locality to request containers on these labeled 
> nodes.
> If the fix is acceptable, we should also check whether the same issue 
> occurs in trunk and other Hadoop versions.
> If it is not acceptable (i.e. the current behavior is by design), how can we 
> use locality to request containers on these labeled nodes?
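
The matching rule described under "Possible solution" can be illustrated with a small standalone sketch. It is not the attached patch and does not use the real scheduler classes: `LocalityLabelSketch`, `Node`, `Request`, `currentMatch`, and `proposedMatch` are hypothetical names, and `currentMatch` is only a simplified model of the label check described in the report. The one value taken from YARN itself is the ANY resource name, which is `"*"`.

```java
// Simplified, hypothetical model of the matching rules discussed in YARN-7872.
// None of these types are Hadoop classes; they exist only for illustration.
final class LocalityLabelSketch {
    static final String ANY = "*";      // same value as ResourceRequest.ANY in YARN
    static final String NO_LABEL = "";  // a null or empty label means "no label"

    static final class Node {
        final String host;
        final String rack;
        final String label;
        Node(String host, String rack, String label) {
            this.host = host;
            this.rack = rack;
            this.label = label;
        }
    }

    static final class Request {
        final String resourceName; // host name, rack name, or ANY
        final String label;        // requested node label, may be null
        Request(String resourceName, String label) {
            this.resourceName = resourceName;
            this.label = label;
        }
    }

    static String normalize(String label) {
        return label == null ? NO_LABEL : label.trim();
    }

    static boolean labelMatches(Node node, Request request) {
        return normalize(request.label).equals(normalize(node.label));
    }

    // Model of the reported behavior: the label check is applied to every
    // request, including requests that name a specific host or rack.
    static boolean currentMatch(Node node, Request request) {
        return labelMatches(node, request);
    }

    // Model of the proposed behavior: the label check applies only to ANY
    // requests; other requests are matched by resource name alone.
    static boolean proposedMatch(Node node, Request request) {
        if (ANY.equals(request.resourceName)) {
            return labelMatches(node, request);
        }
        return request.resourceName.equals(node.host)
            || request.resourceName.equals(node.rack);
    }

    public static void main(String[] args) {
        // The node and request from the example in the report.
        Node node = new Node("SRG", "/default-rack", "persistent");
        Request request = new Request("SRG", null);
        System.out.println("current:  " + currentMatch(node, request));  // false
        System.out.println("proposed: " + proposedMatch(node, request)); // true
    }
}
```

In this model, the request for host SRG is rejected under the current rule because its empty label does not equal `persistent`, and accepted under the proposed rule because the host name matches. The sketch assumes, as the report states, that nodes whose label is not accessible to the queue never reach this check.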
