[ 
https://issues.apache.org/jira/browse/YARN-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037045#comment-15037045
 ] 

Wangda Tan commented on YARN-2885:
----------------------------------

Thanks [~asuresh] for working on this JIRA. I took a quick glance at your patch 
and have some questions/comments:

1)
[~sriramsrao] mentioned at: 
https://issues.apache.org/jira/browse/YARN-2877?focusedCommentId=14221991&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14221991
bq. Capacity is enforced for guaranteed-start containers. For queueable 
containers, policies could be pushed down from central-RM (YARN-2885)

I'm not sure whether, with this implementation, queueable resource requests 
could also be sent to the RM.

2) I'm not quite sure why isDistributedSchedulingEnabled is required for the AM's 
AllocateRequest and RegisterRequest. In my mind, if the AM doesn't want queueable 
containers, it should simply not send queueable resource requests. If you 
agree with 1), the AM should be agnostic to whether a container is allocated by 
the RM or an NM; it should only need to know whether an allocated container is 
queueable or guaranteed.
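
To illustrate the point in 2), here is a rough Java sketch. Every name in it 
(ContainerKind, ResourceRequestSketch, AllocatedContainerSketch) is made up 
for illustration and does not come from the patch or from the YARN API. The 
idea is only that the kind travels with the request and with the allocated 
container, so the AM needs no extra flag and never learns who allocated it.

```java
// Hypothetical: the two kinds of containers discussed in this JIRA.
enum ContainerKind { GUARANTEED, QUEUEABLE }

// Hypothetical request: the AM opts in to queueable containers per request,
// instead of setting a global isDistributedSchedulingEnabled flag.
final class ResourceRequestSketch {
    final int memoryMb;
    final ContainerKind kind;

    ResourceRequestSketch(int memoryMb, ContainerKind kind) {
        this.memoryMb = memoryMb;
        this.kind = kind;
    }
}

// Hypothetical allocation result: it carries the kind, but deliberately
// no field saying whether the RM or an NM made the allocation.
final class AllocatedContainerSketch {
    final String containerId;
    final ContainerKind kind;

    AllocatedContainerSketch(String containerId, ContainerKind kind) {
        this.containerId = containerId;
        this.kind = kind;
    }

    boolean isQueueable() {
        return kind == ContainerKind.QUEUEABLE;
    }
}
```

An AM that never wants queueable containers would simply never build a 
request with ContainerKind.QUEUEABLE.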

3) Why add separate configurations for distributed scheduling, such as:
bq. YarnConfiguration.DIST_SCHEDULING_ENABLED
IIUC, ApplicationMasterService runs in the resource manager, am I correct?

4) Some questions/suggestions regarding RegisterApplicationMasterResponse:
- Add a separate class to encapsulate all queueable-request related 
information. It would be null if distributed scheduling is disabled.
- Such information could change during the application master's lifespan, so do 
you think we need to add it to AllocateResponse as well?
- What are getMinAllocatableCapabilty and getMaxAllocatableCapabilty? Are they 
the same as minimumAllocation/maximumAllocation? If so, why not use the RM's 
minimumAllocation/maximumAllocation?
- Why does the AM need to know getContainerIdStart?
- Is it possible that containerTokenExpiryInterval could vary across NMs? 
If so, would it be better to add expiryInterval to the created container?
- getNodeList is not clear enough; maybe call it getQueueableSupportedNodesList?
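
The first suggestion in 4) could look roughly like the Java sketch below. 
The class and method names (QueueableSchedulingInfo, RegisterResponseSketch, 
hasQueueableInfo) are hypothetical and are not taken from the patch; the 
fields only mirror the getters questioned above. The point is that a single 
nullable object carries everything related to queueable requests.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder for all queueable-request related information.
final class QueueableSchedulingInfo {
    private final long containerIdStart;
    private final int containerTokenExpiryIntervalMs;
    private final List<String> queueableSupportedNodes;

    QueueableSchedulingInfo(long containerIdStart,
                            int containerTokenExpiryIntervalMs,
                            List<String> queueableSupportedNodes) {
        this.containerIdStart = containerIdStart;
        this.containerTokenExpiryIntervalMs = containerTokenExpiryIntervalMs;
        // Defensive copy so later changes by the caller are not visible here.
        this.queueableSupportedNodes =
            Collections.unmodifiableList(new ArrayList<>(queueableSupportedNodes));
    }

    long getContainerIdStart() { return containerIdStart; }
    int getContainerTokenExpiryIntervalMs() { return containerTokenExpiryIntervalMs; }
    List<String> getQueueableSupportedNodes() { return queueableSupportedNodes; }
}

// Hypothetical stand-in for the register response, not the real YARN class.
final class RegisterResponseSketch {
    // Null when distributed scheduling is disabled.
    private final QueueableSchedulingInfo queueableInfo;

    RegisterResponseSketch(QueueableSchedulingInfo queueableInfo) {
        this.queueableInfo = queueableInfo;
    }

    QueueableSchedulingInfo getQueueableSchedulingInfo() { return queueableInfo; }

    boolean hasQueueableInfo() { return queueableInfo != null; }
}
```

If the same holder were also attached to AllocateResponse, updates during 
the application master's lifespan could reuse it unchanged.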

5) Could you move the API changes into an independent patch? I think other features, 
such as centralized resource over-subscription (YARN-1011), could leverage the 
same set of APIs. 

> Create AMRMProxy request interceptor for distributed scheduling decisions for 
> queueable containers
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-2885
>                 URL: https://issues.apache.org/jira/browse/YARN-2885
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Konstantinos Karanasos
>            Assignee: Arun Suresh
>         Attachments: YARN-2885-yarn-2877.001.patch
>
>
> We propose to add a Local ResourceManager (LocalRM) to the NM in order to 
> support distributed scheduling decisions. 
> Architecturally we leverage the RMProxy, introduced in YARN-2884. 
> The LocalRM makes distributed decisions for queueable container requests. 
> Guaranteed-start requests are still handled by the central RM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
