[ 
https://issues.apache.org/jira/browse/YARN-7612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16303051#comment-16303051
 ] 

Weiwei Yang commented on YARN-7612:
-----------------------------------

Hi [~asuresh]

I know this is already committed, but I have one concern about a possible race 
condition. Here are some assumptions before my question:
# {{AllocationTagsManager}} maintains the tags-to-nodes mapping according to 
the actual container states, which means this info is updated as containers 
move through the states of their lifecycle.
# A {{ConstraintPlacementAlgorithm}} calls {{AllocationTagsManager}} to 
get the current state of tags on nodes and makes a placement proposal as a 
{{ConstraintPlacementAlgorithmOutput}}.
# The scheduler tries to allocate resources on the candidate nodes according to 
the {{ConstraintPlacementAlgorithmOutput}}, committing the change if applicable 
or rejecting it if not.
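
To make the first assumption concrete, here is a minimal sketch of a tags-to-nodes mapping that follows container lifecycle events. The class {{NodeTagCounts}} and its methods are hypothetical names chosen for illustration; this is not the real {{AllocationTagsManager}} API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a tags-to-nodes mapping (not a YARN class).
class NodeTagCounts {
  // node -> (tag -> number of live containers carrying that tag)
  private final Map<String, Map<String, Integer>> counts = new HashMap<>();

  // Called when a container with the given tag starts occupying the node.
  synchronized void containerAdded(String node, String tag) {
    counts.computeIfAbsent(node, n -> new HashMap<>())
        .merge(tag, 1, Integer::sum);
  }

  // Called when such a container completes or is released.
  synchronized void containerRemoved(String node, String tag) {
    Map<String, Integer> tags = counts.get(node);
    if (tags == null) {
      return;
    }
    // Returning null from the remapping function drops the tag entry.
    tags.computeIfPresent(tag, (t, c) -> c > 1 ? c - 1 : null);
    if (tags.isEmpty()) {
      counts.remove(node);
    }
  }

  // Number of live containers with this tag on this node (0 if none).
  synchronized int count(String node, String tag) {
    Map<String, Integer> tags = counts.get(node);
    return tags == null ? 0 : tags.getOrDefault(tag, 0);
  }
}
```

Any result a placement algorithm reads from {{count}} is only valid at the moment of the call; a later {{containerAdded}} can invalidate it.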

Question

Does step 3 (the actual allocation in the scheduler) check the tags again? If 
not, what if the tags change after the proposal is made? For example, the 
original request specifies a placement constraint to place the container on a 
node that doesn't have the tag "spark", and the algorithm gives it node 
"nodeA"; however, before the scheduler actually calls 
{{attemptAllocationOnNode}}, a spark container is allocated on "nodeA". If the 
scheduler doesn't check the tag state, it will successfully allocate the 
container on "nodeA" and violate the constraint.

To ensure a consistent state, I think we need to check the 
allocation-tags/node-attributes before the scheduler actually allocates a 
container. Has this problem been addressed in the design?
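
A rough sketch of the kind of check I have in mind, under the assumption that the commit step can re-validate the constraint and update the tags atomically. All names here ({{AntiAffinityCommitSketch}}, {{propose}}, {{commit}}) are hypothetical and only illustrate the idea; they are not existing scheduler APIs.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names are YARN APIs.
class AntiAffinityCommitSketch {
  // node -> tags of containers currently allocated on it
  private final Map<String, Set<String>> tagsByNode = new ConcurrentHashMap<>();

  boolean hasTag(String node, String tag) {
    Set<String> tags = tagsByNode.get(node);
    return tags != null && tags.contains(tag);
  }

  // Planning step: pick the first candidate that lacks forbiddenTag.
  // The answer reflects the tags at call time and may go stale.
  String propose(Iterable<String> candidates, String forbiddenTag) {
    for (String node : candidates) {
      if (!hasTag(node, forbiddenTag)) {
        return node;
      }
    }
    return null;
  }

  // Commit step: re-check the constraint and record the new tag under one
  // lock, so no other commit can slip in between the check and the update.
  // A null forbiddenTag means the request has no anti-affinity constraint.
  synchronized boolean commit(String node, String newTag, String forbiddenTag) {
    if (forbiddenTag != null && hasTag(node, forbiddenTag)) {
      return false; // proposal is stale; caller should re-plan
    }
    tagsByNode.computeIfAbsent(node, n -> ConcurrentHashMap.newKeySet())
        .add(newTag);
    return true;
  }
}
```

With this shape, the scenario above plays out as: {{propose}} returns "nodeA", a spark container is committed on "nodeA" in the meantime, and the later {{commit}} for the original request returns false instead of violating the constraint. The request would then have to go back to the placement algorithm.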

Thanks



> Add Processor Framework for Rich Placement Constraints
> ------------------------------------------------------
>
>                 Key: YARN-7612
>                 URL: https://issues.apache.org/jira/browse/YARN-7612
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>             Fix For: 3.1.0
>
>         Attachments: YARN-7612-YARN-6592.001.patch, 
> YARN-7612-YARN-6592.002.patch, YARN-7612-YARN-6592.003.patch, 
> YARN-7612-YARN-6592.004.patch, YARN-7612-YARN-6592.005.patch, 
> YARN-7612-YARN-6592.006.patch, YARN-7612-YARN-6592.007.patch, 
> YARN-7612-YARN-6592.008.patch, YARN-7612-YARN-6592.009.patch, 
> YARN-7612-YARN-6592.010.patch, YARN-7612-YARN-6592.011.patch, 
> YARN-7612-YARN-6592.012.patch, YARN-7612-v2.wip.patch, YARN-7612.wip.patch
>
>
> This introduces a Placement Processor and a Planning algorithm framework to 
> handle placement constraints and scheduling requests from an app and places 
> them on nodes.
> The actual planning algorithm(s) will be handled in YARN-7613.


