[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596578#comment-16596578 ]

Eric Yang commented on YARN-8569:
---------------------------------

{quote}There were tons of debates before about whether the yarn user should be 
treated as root or not.  We saw issues where c-e allowed the yarn user to 
manipulate other users' directories, or to escalate directly to the root user.  
All of these issues became CVEs.{quote}

If I recall correctly, I reported and fixed container-executor security issues 
like YARN-7590 and YARN-8207.  I believe I have written proper security checks 
to make sure the caller via the network has a Kerberos TGT that matches the end 
user's container directory, and also validated that the data originates from 
the node manager's private directory.  A permission check is performed when 
copying spec file information from the node manager's private directory to the 
end user's container directory.  This design is similar to transporting 
delegation tokens to the container working directory.  I think the security 
validations are good enough to ensure that no security hole has been added by 
this work.  Let me know if you find security holes.
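
To make the validation concrete, here is a minimal sketch of the kind of 
ownership check described above.  The class name, helper name, and parameters 
are hypothetical; this is not the code from the patch:

{code}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

public final class SysFsAccessCheck {

  /**
   * Hypothetical helper: reject a sync request unless the
   * Kerberos-authenticated caller owns the target container.
   */
  public static void checkCallerOwnsContainer(String containerUser)
      throws IOException {
    // On a secured cluster, getCurrentUser() resolves to the caller's
    // Kerberos identity.
    UserGroupInformation caller = UserGroupInformation.getCurrentUser();
    if (!caller.getShortUserName().equals(containerUser)) {
      throw new SecurityException("Caller " + caller.getShortUserName()
          + " does not own a container run by " + containerUser);
    }
  }
}
{code}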

{quote}
From YARN's design standpoint, all NM/RM logic should ideally be as general as 
possible; all service-related concerns should be handled by the service 
framework, such as the API server or ServiceMaster.  I really don't like the 
idea of adding a service-specific API to the NM API.{quote}

The new API is not specific to the YARN service framework.  ContainerExecutor 
provides basic APIs for starting, stopping, and cleaning up containers, but it 
lacks more sophisticated APIs such as synchronizing configuration among 
containers.  The proposed syncYarnSysFS API allows ContainerExecutor developers 
to write their own implementations for populating text information into 
/hadoop/yarn/sysfs.  A custom AM can be written to use the new API to populate 
other text information.  The newly added node manager API is generic and avoids 
the double serialization cost that exists in the container manager protobuf and 
RPC code.  Because the content incurs no extra serialization cost during 
transport, the new API is more efficient and lightweight.  Nothing about it is 
specific to YARN service, although YARN service is the first consumer of this 
API.
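
For illustration, here is a rough sketch of what the executor-side hook could 
look like.  Only the method name syncYarnSysFS comes from this proposal; the 
class name, parameters, and default behavior shown are assumptions, not the 
actual patch:

{code}
import java.io.IOException;

// Hypothetical sketch only: in the real proposal the hook would live on
// ContainerExecutor in the node manager; the signature below is an
// assumption.
public abstract class SysFsCapableExecutor {

  /**
   * Populate text information (for example, the current service spec as
   * JSON) into the container's /hadoop/yarn/sysfs directory.  Executors
   * that cannot support this simply do not override the method.
   */
  public void syncYarnSysFS(String user, String appId, String spec)
      throws IOException {
    throw new UnsupportedOperationException(
        "sysfs sync is not supported by this executor");
  }
}
{code}

A custom AM would then push updated text through the corresponding node manager 
endpoint whenever membership changes, for example after a flex.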

{quote}
1) ServiceMaster ro mount a local directory (under the container's local dir) 
when launching the docker container (for example: ./service-info -> 
/service/sys/fs/) 

2) ServiceMaster requests to re-localize the new service spec json file to the 
./service-info folder.
{quote}

What is "ro mount" in the first sentence?  Is it a remote mount or a read-only 
mount?  ServiceMaster is not guaranteed to run on the same node as the other 
containers.  Hence, there is no practical way to mount the ServiceMaster's 
directory into a container's local directory across nodes.

> Create an interface to provide cluster information to application
> -----------------------------------------------------------------
>
>                 Key: YARN-8569
>                 URL: https://issues.apache.org/jira/browse/YARN-8569
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>              Labels: Docker
>         Attachments: YARN-8569.001.patch, YARN-8569.002.patch
>
>
> Some programs require container hostnames to be known for the application to 
> run.  For example, distributed TensorFlow requires a launch_command that 
> looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>      --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
>      --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
>      --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>      --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
>      --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
>      --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>      --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
>      --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
>      --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>      --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
>      --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
>      --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell or the YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with the YARN flex command.  This is the classic pain point for application 
> developers attempting to automate system environment settings as parameters 
> to the end user application.
> It would be great if the YARN Docker integration could provide a simple 
> option to expose the hostnames of the YARN service via a mounted file.  The 
> file content gets updated when a flex command is performed.  This allows 
> application developers to consume system environment settings via a standard 
> interface.  It is like /proc/devices for Linux, but for Hadoop.  This may 
> involve updating a file in the distributed cache and allowing the file to be 
> mounted via container-executor.
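
On the consumer side, the interface described above boils down to reading a 
file inside the container.  A minimal sketch, assuming the content is mounted 
under /hadoop/yarn/sysfs (the file name app.json here is an assumption):

{code}
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical consumer inside the container: read the mounted spec and
// let the application parse component hostnames out of it.
public class ReadClusterInfo {
  public static void main(String[] args) throws Exception {
    String spec = new String(
        Files.readAllBytes(Paths.get("/hadoop/yarn/sysfs/app.json")));
    System.out.println(spec);
  }
}
{code}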


