shanyu zhao created YARN-9834:
---------------------------------
Summary: Allow using a pool of local users to run Yarn Secure
Container in secure mode
Key: YARN-9834
URL: https://issues.apache.org/jira/browse/YARN-9834
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 3.1.2
Reporter: shanyu zhao
Yarn Secure Container in secure mode separates different users' local files and
container processes running on the same node manager. This depends on an
out-of-band service such as SSSD/Winbind to sync all domain users to the local
machine.
SSSD/Winbind user sync has significant overhead, especially for large
corporations. Also, if Yarn runs inside a Kubernetes cluster (meaning node
managers run inside Docker containers), it doesn't make sense for each
container to sync a whole copy of the domain users.
We should add a new configuration to Yarn that lets us pre-create a pool of
users on each machine/Docker container. At runtime, Yarn allocates a local user
to the domain user that submits the application. When all containers of that
user and all files belonging to that user are deleted, we can release the
allocation and allow other users to use the same local user to run their Yarn
containers.
We propose to add these new configurations:
{code:java}
yarn.nodemanager.linux-container-executor.secure-mode.use-local-user
    defaults to false
yarn.nodemanager.linux-container-executor.secure-mode.local-user-prefix
    defaults to "user"{code}
If we enable this feature with local-user-prefix set to "user", then we expect
pre-created local users user0 - usern to exist, where n equals:
{code:java}
yarn.nodemanager.resource.cpu-vcores {code}
We can use an in-memory allocator to keep the domain user to local user mapping.
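As a rough illustration of such an allocator, here is a minimal sketch. The class name LocalUserPool, its methods, and the reference-counting scheme are hypothetical and not part of any existing Yarn code. It assumes the caller invokes acquire() once per container or localized file set of a domain user and release() once when each is cleaned up, so the local user returns to the pool only after the last reference is gone. The pool size is passed in by the caller, since it is an assumption here how it would be derived from cpu-vcores.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory allocator mapping domain users to pooled local users. */
final class LocalUserPool {
  // Local users that are currently not assigned to any domain user.
  private final Deque<String> free = new ArrayDeque<>();
  // Domain user -> local user currently assigned to it.
  private final Map<String, String> assigned = new HashMap<>();
  // Domain user -> number of live references (containers, localized files).
  private final Map<String, Integer> refCounts = new HashMap<>();

  /** Fills the pool with prefix0 .. prefix(poolSize - 1); these must already exist on the host. */
  LocalUserPool(String prefix, int poolSize) {
    for (int i = 0; i < poolSize; i++) {
      free.addLast(prefix + i);
    }
  }

  /** Returns the local user for domainUser, allocating one if needed, or null if the pool is exhausted. */
  synchronized String acquire(String domainUser) {
    String local = assigned.get(domainUser);
    if (local == null) {
      local = free.pollFirst();
      if (local == null) {
        return null; // no free local user left
      }
      assigned.put(domainUser, local);
    }
    refCounts.merge(domainUser, 1, Integer::sum);
    return local;
  }

  /** Drops one reference; the local user is returned to the pool when the last one is dropped. */
  synchronized void release(String domainUser) {
    Integer count = refCounts.get(domainUser);
    if (count == null) {
      return; // nothing is held for this domain user
    }
    if (count > 1) {
      refCounts.put(domainUser, count - 1);
      return;
    }
    refCounts.remove(domainUser);
    free.addLast(assigned.remove(domainUser));
  }

  /** Number of local users that are currently unassigned. */
  synchronized int available() {
    return free.size();
  }
}
```

Because the mapping lives only in memory, it is lost when the node manager restarts, which is consistent with the recovery limitation noted in this proposal.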
Limitations:
1) This feature does not support the PRIVATE type of resource allocation.
Because PRIVATE resources may be cached in the node manager for a very long
time, supporting them would be a security problem: a user might be able to peek
into a previous user's PRIVATE resources. We can modify the code to treat all
PRIVATE resources as APPLICATION.
2) It is recommended to enable DominantResourceCalculator so that no more than
"cpu-vcores" concurrent containers run on a node manager:
{code:java}
yarn.scheduler.capacity.resource-calculator
= org.apache.hadoop.yarn.util.resource.DominantResourceCalculator {code}
3) Currently this feature does not work with Yarn Node Manager recovery. We may
add recovery support in the future once we hook into the right calls in the
recovery flow.
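The visibility downgrade described in limitation 1 could look roughly like the sketch below. The Visibility enum here is a local stand-in that mirrors the values of Yarn's LocalResourceVisibility, and VisibilityPolicy.effectiveVisibility is a hypothetical helper, not an existing node manager method. Where such a check would actually be wired into resource localization is left open.

```java
/** Hypothetical policy helper; names are illustrative only. */
final class VisibilityPolicy {
  /** Stand-in for Yarn's LocalResourceVisibility values. */
  enum Visibility { PUBLIC, PRIVATE, APPLICATION }

  /**
   * When the local user pool is enabled, PRIVATE resources are treated as
   * APPLICATION so they are not cached beyond the application that requested
   * them. All other requests keep their requested visibility.
   */
  static Visibility effectiveVisibility(Visibility requested, boolean useLocalUserPool) {
    if (useLocalUserPool && requested == Visibility.PRIVATE) {
      return Visibility.APPLICATION;
    }
    return requested;
  }
}
```

With this shape, PUBLIC resources are unaffected, and nothing changes at all when the feature is disabled.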