Wangda Tan commented on YARN-3964:

Hi [~dian.fu],
Thanks for the comments and the patch. I took a quick look at the patch; some 
problems I can see now:
- It adds some unnecessary interfaces/parameters to NodeLabelsProvider, which 
also leads to unnecessary changes to the NM.
- The fetcher implementation polls updated labels for ALL NMs in the cluster; 
if a cluster has several thousand NMs, this can be inefficient.
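
To illustrate the second point, here is a rough sketch (not taken from the 
patch) of how an external updater could push only the labels that changed 
instead of re-reading every NM. The function names and the node-to-label map 
are hypothetical; the output string assumes the 
"node1=label1 node2=label2" argument format accepted by 
`yarn rmadmin -replaceLabelsOnNode`, with one label per node.

```python
from typing import Dict

# Hypothetical representation: node address -> the single label assigned to it.
Labels = Dict[str, str]


def changed_labels(previous: Labels, current: Labels) -> Labels:
    """Return only the nodes whose label differs from the last known state."""
    return {
        node: label
        for node, label in current.items()
        if previous.get(node) != label
    }


def replace_labels_arg(changes: Labels) -> str:
    """Format changes as "node1=labelA node2=labelB" (assumed rmadmin syntax)."""
    return " ".join(f"{node}={label}" for node, label in sorted(changes.items()))
```

With this shape, the amount of work per update depends on the number of nodes 
whose labels changed, not on the cluster size. Nodes that disappear from the 
current map are simply ignored in this sketch.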

My biggest concern is still whether this change is a must-have:
since we already have a set of APIs to do this, I can't see much added value 
in doing this inside the RM. 
For example, we provide the submitApplication REST API for downstream 
applications to use; the YARN RM won't do anything beyond parsing the 
submission request and launching the application. Complex requirements such as 
cron jobs are handled outside of YARN, by tools such as Oozie/Slider. YARN 
doesn't support an "ApplicationSubmissionProvider" inside the RM even if 
sometimes it sounds more 

IMHO, pluggable functionality should only be added to the RM if it is 
necessary; otherwise it becomes over-design. The resource scheduler is one 
example: it is coupled with lots of components, so it's very hard to pull it 
out of the RM. RMNodeLabelsProvider only couples with RMNodeLabelsManager in 
the patch; adding labels to RMNodeLabelsManager inside the RM is the same as 
doing it via REST API / 

Please let me know if you have any other concerns/comments.


> Support NodeLabelsProvider at Resource Manager side
> ---------------------------------------------------
>                 Key: YARN-3964
>                 URL: https://issues.apache.org/jira/browse/YARN-3964
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Dian Fu
>            Assignee: Dian Fu
>         Attachments: YARN-3964 design doc.pdf, YARN-3964.1.patch
> Currently, a CLI/REST API is provided in the Resource Manager to allow users 
> to specify labels for nodes. For labels which may change over time, users 
> have to start a cron job to update the labels. This has the following 
> limitations:
> - The cron job needs to be run as the YARN admin user.
> - This makes it a little complicated to maintain, as users have to make 
> sure this service/daemon is alive.
> Adding a Node Labels Provider in the Resource Manager will give users more 
> flexibility.

This message was sent by Atlassian JIRA
