[
https://issues.apache.org/jira/browse/YARN-7783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333859#comment-16333859
]
Arun Suresh commented on YARN-7783:
-----------------------------------
bq. this change looks pretty intrusive. Changes like temp tag for the internal
algorithm (including AppPlacementAllocator) should not be a part of
AllocationTagsManager.
Hmm.. could you clarify why you find this intrusive? This patch just modifies
the *addTempContainer* method and the *appTempMappings* field, which are not
even used by the AppPlacementAllocator in the first place, so it should not
affect its code paths in any way. :)
bq. This is bad for inter-application affinity. Intra-application is better
since it's easier to control requests within the same app. A simpler way to
fix this problem is restricting anti-affinity only to its own allocation tags
For inter-app, it is definitely bad, but the example I specified in the
description is INTRA-app. Without a fix, we have to state that the feature
only works if source and target tags are the same, which I think is a bit too
naive / restrictive, since our goal is "rich constraint placement" :).
I would still consider this a blocker.
> Add validation step to ensure constraints are not violated due to order in
> which a request is processed
> -------------------------------------------------------------------------------------------------------
>
> Key: YARN-7783
> URL: https://issues.apache.org/jira/browse/YARN-7783
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Priority: Blocker
> Attachments: YARN-7783-YARN-6592.001.patch
>
>
> When the algorithm has placed a container on a node, allocation tags are
> added to the node if the constraint is satisfied. But depending on the order
> in which the algorithm sees the requests, it is possible that a constraint
> that happens to be valid during placement of an earlier-seen request is no
> longer valid after all subsequent requests have been placed.
> For example:
> Assume nodes n1, n2, n3, n4 and n5
> Consider the 2 constraints:
> # *foo* -> anti-affinity with *foo*
> # *bar* -> anti-affinity with *foo*
> And 2 requests
> # req1: NumAllocations = 4, allocTags = [foo]
> # req2: NumAllocations = 1, allocTags = [bar]
> If *req1* is seen first, the algorithm can place the 4 containers on n1, n2,
> n3 and n4. When it gets to *req2*, it will see that 4 nodes have the *foo*
> tag and will place it on n5. But if *req2* is seen first, the *bar* tag can
> be placed on any node, since no node has *foo* at that point. When the
> algorithm then gets to *req1*, since *foo* has no anti-affinity with *bar*,
> it can end up placing *foo* on a node with *bar*, violating the second
> constraint.
> To prevent the above, we need a validation step: after the placements for a
> batch of requests are made, for each request we remove its tags from the
> node and check whether the constraints would still be satisfied if the tags
> were added back to the node.
> When applied to the example above, after the algorithm has run through *req2*
> and then *req1*, we remove the *bar* tag from the node and try to add it back
> on the node. This time, constraint satisfaction will fail, since there is now
> a *foo* tag on the node and *bar* cannot be added. The algorithm will then
> retry placing *req2* on another node.
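A minimal sketch of the example and the validation step described above, in
plain Java. It does not use the actual AllocationTagsManager or the code in the
attached patch: the class PlacementValidationSketch, all of its methods, and
the first-fit node choice are assumptions made purely for illustration. It
also does a single validation pass, which is enough for this example but may
not be for every batch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PlacementValidationSketch {
    /** Source tag -> target tags it must not share a node with (the two constraints above). */
    static final Map<String, Set<String>> ANTI_AFFINITY =
        Map.of("foo", Set.of("foo"), "bar", Set.of("foo"));

    /** One placed container; the node is mutable so validation can move it. */
    static final class Placement {
        final String tag;
        String node;
        Placement(String tag, String node) { this.tag = tag; this.node = node; }
    }

    /** Nodes n1..n5, each with an initially empty list of allocation tags. */
    static Map<String, List<String>> newCluster() {
        Map<String, List<String>> nodes = new LinkedHashMap<>();
        for (String n : List.of("n1", "n2", "n3", "n4", "n5")) {
            nodes.put(n, new ArrayList<>());
        }
        return nodes;
    }

    /** True if adding the tag to a node holding tagsOnNode keeps the tag's own constraint. */
    static boolean satisfied(String tag, List<String> tagsOnNode) {
        for (String target : ANTI_AFFINITY.getOrDefault(tag, Set.of())) {
            if (tagsOnNode.contains(target)) {
                return false;
            }
        }
        return true;
    }

    /** First-fit: add the tag to the first node where its constraint holds; null if none. */
    static Placement place(String tag, Map<String, List<String>> nodes) {
        for (Map.Entry<String, List<String>> e : nodes.entrySet()) {
            if (satisfied(tag, e.getValue())) {
                e.getValue().add(tag);
                return new Placement(tag, e.getKey());
            }
        }
        return null;
    }

    /** Places one container per tag, in the order the algorithm "sees" them. */
    static List<Placement> placeAll(List<String> tagsInOrder, Map<String, List<String>> nodes) {
        List<Placement> placed = new ArrayList<>();
        for (String tag : tagsInOrder) {
            Placement p = place(tag, nodes);
            if (p != null) {
                placed.add(p);
            }
        }
        return placed;
    }

    /** Validation: remove each tag, check it could be re-added; if not, retry on another node. */
    static void validate(List<Placement> placed, Map<String, List<String>> nodes) {
        for (Placement p : placed) {
            List<String> tags = nodes.get(p.node);
            tags.remove(p.tag); // removes one occurrence of this container's tag
            if (satisfied(p.tag, tags)) {
                tags.add(p.tag); // still valid: put the tag back
            } else {
                Placement retry = place(p.tag, nodes);
                p.node = (retry == null) ? null : retry.node;
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> nodes = newCluster();
        // req2 (bar) is seen before req1 (4 x foo), the problematic order.
        List<Placement> placed = placeAll(List.of("bar", "foo", "foo", "foo", "foo"), nodes);
        System.out.println("before validation: " + nodes);
        validate(placed, nodes);
        System.out.println("after validation:  " + nodes);
    }
}
```

In this sketch, seeing *req2* first leaves both *bar* and *foo* on n1. The
validation pass then removes *bar*, finds that it can no longer be added back
next to *foo*, and retries it on another node.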