[
https://issues.apache.org/jira/browse/YARN-7612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16303072#comment-16303072
]
Arun Suresh edited comment on YARN-7612 at 12/25/17 6:01 AM:
-------------------------------------------------------------
[~cheersyang], Thanks for diving deep..
So, let's assume an anti-affinity constraint for 3 containers, where the
associated allocation-tag is "spark". We can enumerate three ways the requests
for these containers can come in from a single app, say *app1*. When *app1*
starts up, it registers placement constraints with the RM stating that it
requires anti-affinity for all scheduling requests with tag *spark* (case 1 is
sketched right after the list below). Then:
# In a _single_ allocate call, it includes 1 SchedulingRequest object with
numAllocations=3 and allocation tags=spark.
# In a _single_ allocate call, it includes 3 SchedulingRequest objects each
with numAllocations=1 and allocation tags=spark, and each has different
resource sizing.
# In the _first_ allocate call, it includes 2 SchedulingRequest objects each
with numAllocations=1 and allocation tags=spark - AND then in the _second_
allocate call, it includes 1 SchedulingRequest object with numAllocations=1 and
allocation tags=spark.
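For concreteness, case 1 could look roughly like the sketch below. This assumes the {{SchedulingRequest}} builder and the {{PlacementConstraints}} helpers on the YARN-6592 branch; the priority, request id and resource sizing are just illustrative placeholders:
{code:java}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import org.apache.hadoop.yarn.api.records.ExecutionTypeRequest;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceSizing;
import org.apache.hadoop.yarn.api.records.SchedulingRequest;
import org.apache.hadoop.yarn.api.resource.PlacementConstraint;
import org.apache.hadoop.yarn.api.resource.PlacementConstraints;

import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.NODE;
import static org.apache.hadoop.yarn.api.resource.PlacementConstraints.PlacementTargets.allocationTag;

public class Case1Example {
  public static void main(String[] args) {
    // Constraint registered with the RM at AM registration time: no two
    // containers tagged "spark" may land on the same node (anti-affinity).
    Map<Set<String>, PlacementConstraint> constraints = new HashMap<>();
    constraints.put(Collections.singleton("spark"),
        PlacementConstraints.build(
            PlacementConstraints.targetNotIn(NODE, allocationTag("spark"))));

    // Case 1: a single SchedulingRequest asking for 3 allocations, all tagged
    // "spark", sent in one allocate call.
    SchedulingRequest sparkRequest = SchedulingRequest.newBuilder()
        .priority(Priority.newInstance(1))           // illustrative priority
        .allocationRequestId(1L)
        .allocationTags(Collections.singleton("spark"))
        .executionType(ExecutionTypeRequest.newInstance())
        .resourceSizing(                             // numAllocations = 3
            ResourceSizing.newInstance(3, Resource.newInstance(1024, 1)))
        .build();

    // 'constraints' goes out with AM registration; 'sparkRequest' goes out in
    // a single allocate call.
    System.out.println(constraints + " / " + sparkRequest);
  }
}
{code}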
Now, for cases 1 and 2: since all the requests will exist in the same
AlgorithmInput (we batch the requests we receive in a single allocate call), all
three requests will be considered at the same time by the algorithm, and for all
three of them the state of the TagsManager seen by the algorithm will be the
same.
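Just to spell the batching out - a toy sketch with made-up names, not the actual
processor code ({{BatchedRequests}} / the AlgorithmInput in the patch do the
real work): everything that arrives in one allocate call becomes one batch, and
the whole batch is placed against a single view of the tag state.
{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy illustration only; names below are invented, not the classes in the patch.
class ToyAllocateBatcher<R> {
  private final List<List<R>> pendingBatches = new ArrayList<>();

  // Invoked once per allocate() call with every SchedulingRequest that call
  // carried: all of them form a single batch, so the placement algorithm later
  // evaluates them against the same snapshot of the allocation tags.
  synchronized void onAllocate(List<R> requestsInThisCall) {
    pendingBatches.add(new ArrayList<>(requestsInThisCall));
  }

  // The placement thread pulls one whole batch at a time.
  synchronized List<R> nextBatch() {
    return pendingBatches.isEmpty()
        ? Collections.<R>emptyList()
        : pendingBatches.remove(0);
  }
}
{code}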
For case 3, I agree, we can end up in the situation you stated, depending on
the timing of the second allocate call and the size of the placement and
scheduling thread pool - by default it is 1, in which case this is less likely
to happen.
I think we mentioned in the BatchedRequests javadoc (and we should make that
explicit in the final docs as well) that for optimal placements, it is
recommended that applications send all related scheduling requests - those
associated with the same allocation tags - in the same allocate call. In any
case, we are targeting _SoftConstraints_ in the first cut.
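As a rough example of that recommendation - this assumes the
{{addSchedulingRequests}} API added to the async AM-RM client as part of this
feature, and the {{sparkRequest}} helper below is just an illustrative name -
build all the *spark* requests up front and hand them to the client together so
they leave in the same allocate call:
{code:java}
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceSizing;
import org.apache.hadoop.yarn.api.records.SchedulingRequest;
import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;

public class SendTogetherExample {

  // Case 2 shape: three requests, numAllocations=1 each, all tagged "spark"
  // but with different sizing. Handing them to the client in one call keeps
  // them in the same allocate call, and hence in the same batch on the RM side.
  static void requestSparkContainers(AMRMClientAsync<?> amClient) {
    List<SchedulingRequest> sparkRequests = Arrays.asList(
        sparkRequest(1L, Resource.newInstance(1024, 1)),
        sparkRequest(2L, Resource.newInstance(2048, 2)),
        sparkRequest(3L, Resource.newInstance(4096, 4)));
    amClient.addSchedulingRequests(sparkRequests);
  }

  private static SchedulingRequest sparkRequest(long allocationRequestId,
      Resource size) {
    return SchedulingRequest.newBuilder()
        .priority(Priority.newInstance(1))      // illustrative priority
        .allocationRequestId(allocationRequestId)
        .allocationTags(Collections.singleton("spark"))
        .resourceSizing(ResourceSizing.newInstance(1, size))
        .build();
  }
}
{code}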
For _HardConstraints_, yes, I agree we need an extra check in the
{{attemptAllocationOnNode}} phase - maybe expose a {{canAssign}} method that
takes the PlacementConstraint, the TagsManager and the container tag. I think
[~pgaref] is adding a {{canAssign}} method in YARN-7613, but we need to call it.
Feel free to raise a JIRA to add that and I can help review.
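Roughly what I have in mind for that check, as a purely hypothetical sketch -
the {{NodeTagCounts}} interface and {{countOnNode}} below are made-up stand-ins
for the TagsManager lookup, not the actual YARN-7613 API:
{code:java}
// Hypothetical sketch of the extra check in attemptAllocationOnNode; none of
// these names are the real YARN-7613 API.
interface NodeTagCounts {
  // How many containers already placed (or tentatively placed) on this node
  // carry the given allocation tag.
  long countOnNode(String nodeId, String allocationTag);
}

final class AntiAffinityCheck {
  // For a plain anti-affinity constraint on `tag` (max cardinality 0 on the
  // node), assignment is allowed only if the node does not yet hold any
  // container with that tag. attemptAllocationOnNode would consult this
  // before committing the allocation.
  static boolean canAssign(NodeTagCounts tags, String nodeId, String tag) {
    return tags.countOnNode(nodeId, tag) == 0;
  }
}
{code}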
But, to be honest, even then we cannot guarantee the constraint is perfectly
honored - unless
{{yarn.resourcemanager.placement-constraints.scheduler.pool-size}} == 1.
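For reference, pinning that pool to a single thread (e.g. in a test
configuration) would just be the following - using the plain string key, since
I am not sure off-hand whether the branch exposes a constant for it:
{code:java}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SingleThreadedPlacementConfig {
  public static void main(String[] args) {
    // One placement thread means batches are placed strictly one after
    // another - trading throughput for stricter constraint enforcement.
    YarnConfiguration conf = new YarnConfiguration();
    conf.setInt(
        "yarn.resourcemanager.placement-constraints.scheduler.pool-size", 1);
    System.out.println(conf.getInt(
        "yarn.resourcemanager.placement-constraints.scheduler.pool-size", -1));
  }
}
{code}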
> Add Processor Framework for Rich Placement Constraints
> ------------------------------------------------------
>
> Key: YARN-7612
> URL: https://issues.apache.org/jira/browse/YARN-7612
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Fix For: 3.1.0
>
> Attachments: YARN-7612-YARN-6592.001.patch,
> YARN-7612-YARN-6592.002.patch, YARN-7612-YARN-6592.003.patch,
> YARN-7612-YARN-6592.004.patch, YARN-7612-YARN-6592.005.patch,
> YARN-7612-YARN-6592.006.patch, YARN-7612-YARN-6592.007.patch,
> YARN-7612-YARN-6592.008.patch, YARN-7612-YARN-6592.009.patch,
> YARN-7612-YARN-6592.010.patch, YARN-7612-YARN-6592.011.patch,
> YARN-7612-YARN-6592.012.patch, YARN-7612-v2.wip.patch, YARN-7612.wip.patch
>
>
> This introduces a Placement Processor and a planning algorithm framework to
> handle placement constraints and scheduling requests from an app and place
> them on nodes.
> The actual planning algorithm(s) will be handled in YARN-7613.