[ 
https://issues.apache.org/jira/browse/YARN-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shanyu zhao updated YARN-9834:
------------------------------
    Description: 
Yarn Secure Container in secure mode allows separation of different users' 
local files and container processes running on the same node manager. This 
depends on an out-of-band service such as SSSD/Winbind to sync all domain users 
to the local machine.

SSSD/Winbind user sync has a lot of overhead, especially for large corporations. 
Also, if Yarn runs inside a Kubernetes cluster (meaning node managers run 
inside Docker containers), it doesn't make sense for each container to sync a 
whole copy of the domain users.

We should add a new configuration to Yarn so that we can pre-create a pool 
of users on each machine/Docker container. At runtime, Yarn allocates a 
local user to the domain user that submits the application. When all containers 
of that user are finished and all files belonging to that user are deleted, we 
can release the allocation and allow other users to use the same local user to 
run their Yarn containers.
h2. Design

We propose to add these new configurations:
{code:java}
yarn.nodemanager.linux-container-executor.secure-mode.use-local-user, defaults 
to false
yarn.nodemanager.linux-container-executor.secure-mode.local-user-prefix, 
defaults to "user"{code}
By default this feature is turned off. If we enable it with local-user-prefix 
set to "user", then we expect pre-created local users user0 - usern to exist, 
where n equals:
{code:java}
yarn.nodemanager.resource.cpu-vcores {code}
We can use an in-memory allocator to keep the domain user to local user 
mapping. The open question is when to add the mapping and when to remove it.

In the node manager, ApplicationImpl implements the state machine for a Yarn 
app life cycle, but only if the app has at least 1 container running on that 
node manager. We can hook into that code to add the mapping during application 
initialization.

For removing the mapping, we need to wait for 3 things:

1) All applications of the same user are completed;
2) All log handling of the applications (log aggregation or non-aggregated 
handling) is done;
3) All pending FileDeletionTasks that use the user's identity are finished.
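
The add/remove rules above can be sketched as a reference-counted pool. This is only a minimal illustration and not the actual patch: the class name LocalUserPool, its methods, and the choice to count every pending application, log-handling job, and deletion task as one reference are all assumptions made for the example. The pool size is passed in explicitly rather than read from yarn.nodemanager.resource.cpu-vcores.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory allocator mapping domain users to pooled local users. */
final class LocalUserPool {
  /** One mapping plus the number of outstanding users of that mapping. */
  private static final class Lease {
    final String localUser;
    int refCount; // running apps + pending log handling + pending file deletions

    Lease(String localUser) {
      this.localUser = localUser;
    }
  }

  private final Deque<String> free = new ArrayDeque<>();
  private final Map<String, Lease> leases = new HashMap<>();

  /** Assumes local users prefix0 .. prefix(size-1) were pre-created on the host. */
  LocalUserPool(String prefix, int size) {
    for (int i = 0; i < size; i++) {
      free.addLast(prefix + i);
    }
  }

  /** Adds (or reuses) the mapping and takes one reference on it. */
  synchronized String acquire(String domainUser) {
    Lease lease = leases.get(domainUser);
    if (lease == null) {
      String localUser = free.pollFirst();
      if (localUser == null) {
        throw new IllegalStateException("No free local user for " + domainUser);
      }
      lease = new Lease(localUser);
      leases.put(domainUser, lease);
    }
    lease.refCount++;
    return lease.localUser;
  }

  /** Drops one reference; the local user returns to the pool at zero. */
  synchronized void release(String domainUser) {
    Lease lease = leases.get(domainUser);
    if (lease == null) {
      throw new IllegalStateException("No mapping for " + domainUser);
    }
    if (--lease.refCount == 0) {
      leases.remove(domainUser);
      free.addLast(lease.localUser);
    }
  }

  /** Returns the mapped local user, or null if the domain user has no mapping. */
  synchronized String lookup(String domainUser) {
    Lease lease = leases.get(domainUser);
    return lease == null ? null : lease.localUser;
  }
}
```

In this sketch, application initialization would call acquire() once, and each log-handling job or deletion task that runs under the user's identity would take its own reference, so the mapping is removed only after the last of the 3 conditions is met.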
h2. Limitations

1) This feature does not support the PRIVATE visibility type of resource 
allocation. Because PRIVATE resources are potentially cached in the node 
manager for a very long time, supporting them would be a security problem: a 
user might be able to peek into a previous user's PRIVATE resources. We can 
modify the code to treat all PRIVATE resources as APPLICATION type.

2) It is recommended to enable DominantResourceCalculator so that no more than 
"cpu-vcores" concurrent containers run on a node manager:
{code:java}
yarn.scheduler.capacity.resource-calculator
= org.apache.hadoop.yarn.util.resource.DominantResourceCalculator {code}
3) Currently this feature does not work with Yarn Node Manager recovery. We may 
add recovery support in the future once we hook into the right calls in the 
recovery flow.

 

  was:
Yarn Secure Container in secure mode allows separation of different user's 
local files and container processes running on the same node manager. This 
depends on an out of band service such as SSSD/Winbind to sync all domain users 
to local machine.

SSSD/Winbind user sync has lots of overhead, especially for large corporations. 
Also if running Yarn inside Kubernetes cluster (meaning node managers running 
inside Docker container), it doesn't make sense for each container to sync a 
whole copy of domain users.

We should allow a new configuration to Yarn, such that we can pre-create a pool 
of users on each machine/Docker container. And at runtime, Yarn allocates a 
local user to the domain user that submits the application. When all containers 
of that user and all files belonging to that user are deleted, we can release 
the allocation and allow other users to use the same local user to run their 
Yarn containers.

We propose to add these new configurations:
{code:java}
yarn.nodemanager.linux-container-executor.secure-mode.use-local-user, defaults 
to false
yarn.nodemanager.linux-container-executor.secure-mode.local-user-prefix, 
defaults to "user"{code}
If we enable this feature, with local-user-prefix set to "user", then we expect 
there are pre-created local users user0 - usern, where n equals to:
{code:java}
yarn.nodemanager.resource.cpu-vcores {code}
We can use an in-memory allocator to keep the domain user to local user mapping.

Limitations:

1) This feature does not support PRIVATE type of resource allocation. Because 
PRIVATE type of resources are potentially cached in the node manager for a very 
long time, supporting it will be a security problem that a user might be able 
to peek into previous user's PRIVATE resources. We can modify code to treat all 
PRIVATE type of resource as APPLICATION.

2) It is recommended to enable DominantResourceCalculator so that no more than 
"cpu-vcores" number of concurrent containers running on a node manager:
{code:java}
yarn.scheduler.capacity.resource-calculator
= org.apache.hadoop.yarn.util.resource.DominantResourceCalculator {code}
3) Currently this feature does not work with Yarn Node Manager recovery. We may 
add recovery support in the future when we hook up with the right calls in the 
recovery flow.

 


> Allow using a pool of local users to run Yarn Secure Container in secure mode
> -----------------------------------------------------------------------------
>
>                 Key: YARN-9834
>                 URL: https://issues.apache.org/jira/browse/YARN-9834
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.1.2
>            Reporter: shanyu zhao
>            Assignee: shanyu zhao
>            Priority: Major


