[ 
https://issues.apache.org/jira/browse/YARN-7399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7399:
----------------------------
    Attachment: YARN-7399.png

See the attached diagram for the current implementation and the proposed 
refinement.  This will reduce duplicated code for storing metadata and 
support multiple storage types.
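
One way to picture "multiple storage types" behind a single code path is a
small storage interface with pluggable backends. The sketch below is only an
illustration under assumptions: ServiceMetadataStore, InMemoryMetadataStore,
and all method names are hypothetical and are not taken from the attached
diagram or from any patch. An HDFS, Solr, or HBase backend would implement
the same interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical abstraction: one interface, several possible backends. */
interface ServiceMetadataStore {
  /** Stores the JSON specification of a service owned by the given user. */
  void save(String owner, String serviceName, String specJson);

  /** Returns the stored specification, or empty if none exists. */
  Optional<String> load(String owner, String serviceName);

  /** Lists every stored service as "owner/serviceName", sorted. */
  List<String> listAll();
}

/** Illustrative in-memory backend, standing in for HDFS, Solr, or HBase. */
final class InMemoryMetadataStore implements ServiceMetadataStore {
  private final Map<String, String> specs = new ConcurrentHashMap<>();

  private static String key(String owner, String serviceName) {
    return owner + "/" + serviceName;
  }

  @Override
  public void save(String owner, String serviceName, String specJson) {
    specs.put(key(owner, serviceName), specJson);
  }

  @Override
  public Optional<String> load(String owner, String serviceName) {
    return Optional.ofNullable(specs.get(key(owner, serviceName)));
  }

  @Override
  public List<String> listAll() {
    // A single lookup against the central store; no per-user directory crawl.
    List<String> keys = new ArrayList<>(specs.keySet());
    Collections.sort(keys);
    return keys;
  }
}
```

With this shape, listing all deployed services is a single call against the
central store, regardless of how many users own services.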

> Yarn services metadata storage improvement
> ------------------------------------------
>
>                 Key: YARN-7399
>                 URL: https://issues.apache.org/jira/browse/YARN-7399
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: yarn-native-services
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>         Attachments: YARN-7399.png
>
>
> In Slider, metadata is stored in the user's home directory. The Slider 
> command line interface interacts with HDFS directly to list deployed 
> applications, and invokes the YARN API or HDFS API to provide information to 
> the user. This design works for a single user managing his/her own 
> applications. Now that this design has been ported to Yarn services, it has 
> become apparent that it makes it difficult for an administrator to list and 
> manage all deployed applications on a Hadoop cluster. The Resource Manager 
> needs to crawl through every user's home directory to compile metadata about 
> deployed applications. This can put high load on the namenode, which has to 
> serve hundreds or thousands of list-directory calls against directories 
> owned by different users. Hence, it might be best to centralize the metadata 
> storage in Solr or HBase to reduce the number of IO calls to the namenode 
> for managing applications.
> In Slider, one application is composed of metainfo, specifications in JSON, 
> and a zip file payload that contains application code and deployment code. 
> Both the meta information and the zip file payload are stored in the same 
> application directory in HDFS. This works well for distributed applications 
> without a central application manager that oversees all applications.
> In the next generation of application management, we would like to move the 
> metainfo and JSON specifications to a centralized storage managed by the 
> YARN user, and keep the payload zip file in the user's home directory or in 
> a Docker registry. This arrangement can provide a faster lookup for metainfo 
> when we list all deployed applications and services on the YARN dashboard.
> When we centralize metainfo under the YARN user, we also need to build ACLs 
> to enforce who can manage and update applications. The current proposal is:
> yarn.admin.acl - list of groups that can submit/reconfigure/pause/kill all 
> applications
> normal users - can submit/reconfigure/pause/kill his/her own applications
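
The two ACL rules proposed above can be sketched as a single check. This is a
minimal illustration, not the implementation: the ServiceAcl class and its
canManage method are hypothetical names, and the set of admin groups is
assumed to have already been parsed from the yarn.admin.acl setting.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical check for the two proposed rules: admin groups and owners. */
final class ServiceAcl {
  // Groups assumed to come from yarn.admin.acl (parsing not shown here).
  private final Set<String> adminGroups;

  ServiceAcl(Set<String> adminGroups) {
    this.adminGroups = new HashSet<>(adminGroups);
  }

  /**
   * Returns true if the user may submit/reconfigure/pause/kill the service:
   * either the user owns it, or the user belongs to an admin group.
   */
  boolean canManage(String user, Set<String> userGroups, String serviceOwner) {
    if (user.equals(serviceOwner)) {
      return true; // normal users manage their own applications
    }
    for (String group : userGroups) {
      if (adminGroups.contains(group)) {
        return true; // members of an admin group manage all applications
      }
    }
    return false;
  }
}
```

A caller would evaluate canManage once per request, before touching the
centralized metadata for the target service.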



