Wangda Tan commented on YARN-3964:

Hi [~dian.fu],
I just went through the design doc.

According to the design doc, configuring a centralized provider also requires 
considering 1) the frequency of invoking the fetch script, 2) the permissions of 
the script, 3) a customized provider, and 4) configuring the RM classpath to make 
use of it. It's not simpler than using cron + script + REST API/CLI.

To some points in the design doc:
bq. Why need provider instead of REST API -- REST API requires admin privilege. 
Configuring/executing the provider script also needs YARN's admin permission.

bq. User can provide their own node labels provider plug-in which can fetch 
labels from a database, from a remote server, etc. 
Using a cron job to invoke a customized script can also achieve this.
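
To make the cron + script alternative concrete, here is a minimal sketch of such a script. It assumes the labels are pushed through the existing {{yarn rmadmin -replaceLabelsOnNode}} CLI; {{fetch_labels}}, the host names, and the {{--apply}} flag are hypothetical placeholders, not anything from the design doc.

```python
import subprocess
import sys
from typing import Dict, List


def fetch_labels() -> Dict[str, List[str]]:
    """Hypothetical label source; swap in a database or remote-server lookup."""
    return {"node1.example.com": ["gpu"], "node2.example.com": ["ssd"]}


def build_command(labels: Dict[str, List[str]]) -> List[str]:
    """Build the rmadmin invocation: 'node1=l1,l2 node2=l3' as one argument."""
    mapping = " ".join(
        f"{node}={','.join(node_labels)}"
        for node, node_labels in sorted(labels.items())
    )
    return ["yarn", "rmadmin", "-replaceLabelsOnNode", mapping]


def main(argv: List[str]) -> None:
    command = build_command(fetch_labels())
    if "--apply" in argv:
        # Needs to run as a user with YARN admin permission.
        subprocess.run(command, check=True)
    else:
        # Dry run by default: only show what would be executed.
        print(" ".join(command))


if __name__ == "__main__":
    main(sys.argv[1:])
```

A cron entry would then call the script with {{--apply}} at whatever frequency the labels are expected to change. A failure in the script only fails that cron run and leaves the RM process untouched.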

This approach also needs to add a plugin inside the RM, which could be 
unsafe and could sometimes cause the RM to die if the provider has issues.

IMHO, this will add unnecessary complexity and risk to the RM, and it does not 
make configuration easier. Please correct me if I missed anything.


> Support NodeLabelsProvider at Resource Manager side
> ---------------------------------------------------
>                 Key: YARN-3964
>                 URL: https://issues.apache.org/jira/browse/YARN-3964
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Dian Fu
>            Assignee: Dian Fu
>         Attachments: YARN-3964 design doc.pdf
> Currently, CLI/REST API is provided in Resource Manager to allow users to 
> specify labels for nodes. For labels which may change over time, users will 
> have to start a cron job to update the labels. This has the following 
> limitations:
> - The cron job needs to be run as the YARN admin user.
> - This makes it a little complicated to maintain, as users will have to make 
> sure this service/daemon is alive.
> Adding a Node Labels Provider in Resource Manager will give users more 
> flexibility.

