[
https://issues.apache.org/jira/browse/YARN-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinod Kumar Vavilapalli updated YARN-1972:
------------------------------------------
Issue Type: Sub-task (was: Improvement)
Parent: YARN-732
> Implement secure Windows Container Executor
> -------------------------------------------
>
> Key: YARN-1972
> URL: https://issues.apache.org/jira/browse/YARN-1972
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Reporter: Remus Rusanu
> Assignee: Remus Rusanu
> Labels: security, windows
> Attachments: YARN-1972.1.patch, YARN-1972.2.patch, YARN-1972.3.patch,
> YARN-1972.delta.4.patch, YARN-1972.delta.5.patch, YARN-1972.trunk.4.patch,
> YARN-1972.trunk.5.patch
>
>
> h1. Windows Secure Container Executor (WCE)
> YARN-1063 adds the necessary infrastructure to launch a process as a domain
> user as a solution for the problem of having a security boundary between
> processes executed in YARN containers and the Hadoop services. The WCE is a
> container executor that leverages the winutils capabilities introduced in
> YARN-1063 and launches containers as an OS process running as the job
> submitter user. A description of the S4U infrastructure used by YARN-1063 and
> of the alternatives considered can be read on that JIRA.
> The WCE is based on the DefaultContainerExecutor. It relies on the DCE to
> drive the flow of execution, but it overrides some methods so that it:
> * changes the DCE-created user cache directories to be owned by the job user
> and by the nodemanager group.
> * changes the actual container run command to use the 'createAsUser' command
> of winutils task instead of 'create'.
> * runs the localization as a standalone process instead of an in-process Java
> method call. This in turn relies on the winutils createAsUser feature to run
> the localization as the job user.
>
> When compared to LinuxContainerExecutor (LCE), the WCE has some minor
> differences:
> * it does not delegate the creation of the user cache directories to the
> native implementation.
> * it does not require special handling to be able to delete user files.
> The WCE design came out of practical trial and error. I had
> to iron out some issues around the Windows script shell limitations (command
> line length) to get it to work, the biggest being the huge CLASSPATH
> that is commonplace in container executions in a Hadoop environment. The job
> container itself already deals with this via a so-called 'classpath
> jar', see HADOOP-8899 and YARN-316 for details. Launching the WCE localizer
> as a separate container raised the same issue, so I used the same
> 'classpath jar' approach.
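
For readers unfamiliar with the 'classpath jar' technique mentioned above: the idea is a jar that contains no classes and whose manifest Class-Path attribute lists the real classpath entries, so the command line only has to name a single jar. Below is a minimal sketch using only the JDK. The class and method names are hypothetical, and this is not the code from HADOOP-8899, YARN-316 or the attached patches.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.stream.Collectors;

// Hypothetical illustration class, not part of Hadoop.
final class ClasspathJarSketch {

    // The Class-Path manifest attribute is a space-separated list of URLs,
    // so each entry is converted to an absolute file: URI (spaces become %20).
    static String toClassPathAttribute(List<Path> entries) {
        return entries.stream()
                .map(p -> p.toAbsolutePath().toUri().toString())
                .collect(Collectors.joining(" "));
    }

    // Writes a jar that holds nothing but a manifest carrying the classpath.
    static void writeClasspathJar(Path jar, List<Path> entries) throws IOException {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the manifest is written empty.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.CLASS_PATH, toClassPathAttribute(entries));
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out, manifest)) {
            // No entries are added: the manifest is the whole payload.
        }
    }

    // Reads the Class-Path attribute back, e.g. to verify what was written.
    static String readClassPath(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getManifest().getMainAttributes()
                    .getValue(Attributes.Name.CLASS_PATH);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder entries; a real container classpath has hundreds of them.
        List<Path> entries = Arrays.asList(Paths.get("lib", "a.jar"), Paths.get("conf"));
        Path jar = Files.createTempFile("classpath-", ".jar");
        writeClasspathJar(jar, entries);
        System.out.println("java -cp " + jar + " <main class>");
        System.out.println("Class-Path: " + readClassPath(jar));
    }
}
```

The launched JVM then needs only `-cp <that jar>`, which keeps the command line short no matter how many entries the manifest lists.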
> h2. Deployment Requirements
> To use the WCE one needs to set
> `yarn.nodemanager.container-executor.class` to
> `org.apache.hadoop.yarn.server.nodemanager.WindowsSecureContainerExecutor`
> and set `yarn.nodemanager.windows-secure-container-executor.group` to a
> Windows security group name that the nodemanager service principal is a
> member of (the equivalent of the LCE
> `yarn.nodemanager.linux-container-executor.group`). Unlike the LCE, the WCE
> does not require any configuration outside of Hadoop's own yarn-site.xml.
> For the WCE to work the nodemanager must run as a service principal that is a
> member of the local Administrators group, or as LocalSystem. This is derived
> from the need to invoke the LoadUserProfile API, whose specification mentions
> these requirements. This is in addition to the SE_TCB privilege mentioned in
> YARN-1063, but this requirement will automatically imply that the SE_TCB
> privilege is held by the nodemanager. For the Linux speakers in the audience,
> the requirement is basically to run NM as root.
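
To make the two settings named at the start of this section concrete, here is a small sketch that renders them in the `<property>` name/value form used by yarn-site.xml. The property names and the executor class name are taken from the description; the class `WceYarnSiteSketch`, its methods and the group name `ExampleNodeManagerGroup` are placeholders, and values are assumed not to need XML escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: prints the WCE settings as a
// yarn-site.xml fragment.
final class WceYarnSiteSketch {

    // Collects the two properties the description says must be set.
    static Map<String, String> wceSettings(String nodemanagerGroup) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("yarn.nodemanager.container-executor.class",
                "org.apache.hadoop.yarn.server.nodemanager.WindowsSecureContainerExecutor");
        props.put("yarn.nodemanager.windows-secure-container-executor.group",
                nodemanagerGroup);
        return props;
    }

    // Renders the properties in Hadoop's <configuration>/<property> layout.
    // Assumes names and values contain no characters that need XML escaping.
    static String renderProperties(Map<String, String> props) {
        StringBuilder sb = new StringBuilder("<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("  <property>\n")
              .append("    <name>").append(e.getKey()).append("</name>\n")
              .append("    <value>").append(e.getValue()).append("</value>\n")
              .append("  </property>\n");
        }
        return sb.append("</configuration>\n").toString();
    }

    public static void main(String[] args) {
        // "ExampleNodeManagerGroup" is a placeholder: use a Windows security
        // group that the nodemanager service principal is a member of.
        System.out.print(renderProperties(wceSettings("ExampleNodeManagerGroup")));
    }
}
```

The printed `<property>` elements would be merged into the existing yarn-site.xml on each node rather than replacing the file.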
> h2. Dedicated high privilege Service
> Due to the high privilege required by the WCE we had discussed the need to
> isolate the high privilege operations into a separate process, an 'executor'
> service that is solely responsible for starting the containers (including the
> localizer). The NM would have to authenticate, authorize and communicate with
> this service via an IPC mechanism and use this service to launch the
> containers. I still believe we'll end up deploying such a service, but the
> effort to onboard such a new platform-specific service on the project is
> not trivial.