[jira] [Commented] (YARN-5673) [Umbrella] Re-write container-executor to improve security, extensibility, and portability

Miklos Szegedi (JIRA) Wed, 07 Dec 2016 11:54:06 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15729736#comment-15729736
 ]


Miklos Szegedi commented on YARN-5673:
--------------------------------------

Thank you [~vvasudev] for the quick and detailed response! I really appreciate 
it.
{quote}
All of these binaries will require the setuid bit to be set, which means 
administrators will have to set permissions and manage 4 binaries. We also have 
to worry about 4 binaries that can have privilege escalation as opposed to one 
- any hot fixes for example will require all 4 binaries to be updated as 
opposed to just one. Interestingly you feel that administrator overhead of 
managing 4 binaries is worth it whereas some folks would prefer it the other 
way round. Do other folks feel that the multiple binaries approach is the way 
to go?
{quote}
Yes, I absolutely agree that this is a question of preference. I am not sure 
about the ratios though. In terms of overhead, the administrator has to enable 
the modules in the configuration anyway. My thinking is that it is easier to 
set the permissions using familiar Unix tools than to look up the 
configuration files, read the documentation about them, and enable the 
required modules in the right format. In the past I have seen issues caused by 
extra spaces in configuration values, for example.
However, please take some time to answer this one question. Let's assume we 
were about to design /usr/bin/at, /usr/bin/sudo and /usr/bin/passwd. They 
differ from each other about as much as container launching and mounting 
cgroups do. Would you design them as separate tools, or as a single binary 
that loads them separately as modules based on a configuration file and 
command line options?
{quote}
We also have to worry about 4 binaries that can have privilege escalation as 
opposed to one
{quote}
I think the risk of privilege escalation is proportional to the amount of code 
rather than the number of binaries, so it is about the same. On the other hand, 
packing multiple functions into the same memory space may increase the sum of 
the individual risks in the case of native code.
{quote}
any hot fixes for example will require all 4 binaries to be updated as opposed 
to just one. 
{quote}
(I assume that the code will be super stable :-) ...)
This was actually one of the questions that I raised. What is the common code 
among the features separated into modules? All of the binaries need to be 
patched only if that common functionality is broken. I think this would be 
limited to auditing, logging, and maybe some filesystem operations that can be 
linked into the tools.
{quote}
Fair point. The idea here is that -
(1) Administrators will not add arbitrary modules to the module list.
(2) The posix-container-executor will give up all privileges before loading the 
modules which don't require administrator privileges
(3) Give administrators an option to turn off modules that require 
administrator privileges.
Would these help mitigate your concerns? The issue with the current setup is 
that there is no clean way to enable/disable functionality that administrators 
do not want enabled on their cluster.
{quote}
I agree, this is an issue in the current setup and yes, I think these are the 
right design decisions. Just as a side note, I prefer privileged modules be 
disabled by default for security and supportability reasons.
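
The "privileged modules off by default" preference above can be sketched 
roughly as follows. This is only an illustration, not the proposed 
container-executor design: the module names, the KNOWN_MODULES table and the 
enabled_modules() helper are all hypothetical, and Python is used purely for 
brevity. Docker and network throttling are marked privileged because the issue 
description lists them as features that need administrator privileges; 
"logging" is an assumed unprivileged module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    privileged: bool


# Hypothetical module table; the names are for illustration only.
KNOWN_MODULES = {
    "logging": Module("logging", privileged=False),
    "docker": Module("docker", privileged=True),
    "network-throttling": Module("network-throttling", privileged=True),
}


def enabled_modules(requested, allow_privileged=frozenset()):
    """Return the modules that may be loaded for this configuration.

    Unknown names are rejected, so arbitrary modules cannot be added.
    Privileged modules are skipped unless the administrator has listed
    them explicitly in allow_privileged (off by default).
    """
    result = []
    for name in requested:
        module = KNOWN_MODULES.get(name)
        if module is None:
            raise ValueError(f"unknown module: {name!r}")
        if module.privileged and name not in allow_privileged:
            continue  # privileged and not explicitly enabled: stay off
        result.append(module)
    return result
```

With this shape, enabled_modules(["logging", "docker"]) yields only the 
logging module, and docker is loaded only when the administrator passes 
allow_privileged={"docker"}.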
{quote}
Do you have some scenarios where container launch time has been an issue? The 
security aspects of a long running process versus one which is invoked on 
demand are different as well.
{quote}
I just wanted to discuss this design option early, before much coding has 
started. If we want to use YARN not just for long batch processing but also 
for lots of quick requests in the future, launch time becomes an issue. I 
thought I would raise a pipe as another option for communicating the commands, 
alongside the command line, files, and environment variables.
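
A rough sketch of what the pipe option could look like, under stated 
assumptions: one JSON command per line, a helper that stays alive between 
commands, and a reply per command. The serve() and send() functions and the 
message fields ("id", "op", "status") are hypothetical. A thread stands in for 
the separate long-running helper process so that the example is 
self-contained, and Python is used only for brevity.

```python
import json
import os
import threading


def serve(requests, replies):
    """Hypothetical long-running helper: reads one JSON command per line."""
    for line in requests:
        command = json.loads(line)
        # A real helper would validate and execute the command here.
        reply = {"id": command["id"], "status": "ok", "op": command["op"]}
        replies.write(json.dumps(reply) + "\n")
        replies.flush()


def send(requests, replies, command):
    """Write one command to the pipe and wait for its reply."""
    requests.write(json.dumps(command) + "\n")
    requests.flush()
    return json.loads(replies.readline())


def demo():
    req_r, req_w = os.pipe()  # caller -> helper
    rep_r, rep_w = os.pipe()  # helper -> caller
    helper_in, helper_out = os.fdopen(req_r, "r"), os.fdopen(rep_w, "w")
    client_out, client_in = os.fdopen(req_w, "w"), os.fdopen(rep_r, "r")

    helper = threading.Thread(target=serve, args=(helper_in, helper_out))
    helper.start()
    # Several commands reuse the same helper instead of one exec per command.
    answers = [send(client_out, client_in, {"id": i, "op": "launch"})
               for i in range(3)]
    client_out.close()  # EOF on the request pipe ends the helper loop
    helper.join()
    for stream in (helper_in, helper_out, client_in):
        stream.close()
    return answers
```

Calling demo() sends three commands through the same pipe pair. The trade-off 
raised in the quoted question still applies: a helper that stays resident has 
a different security profile from a binary that is invoked on demand.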

> [Umbrella] Re-write container-executor to improve security, extensibility, 
> and portability
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-5673
>                 URL: https://issues.apache.org/jira/browse/YARN-5673
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>         Attachments: container-executor Re-write Design Document.pdf
>
>
> As YARN adds support for new features that require administrator 
> privileges (such as support for network throttling and docker), we’ve had to 
> add new capabilities to the container-executor. This has led to a recognition 
> that the current container-executor security features as well as the code 
> could be improved. The current code is fragile and it’s hard to add new 
> features without causing regressions. Some of the improvements that need to 
> be made are -
> *Security*
> Currently the container-executor has limited security features. It relies 
> primarily on the permissions set on the binary but does little additional 
> security beyond that. There are a few outstanding issues today -
> - No audit log
> - No way to disable features - network throttling and docker support are 
> built in and there’s no way to turn them off at a container-executor level
> - Code can be improved - a lot of the code switches users back and forth in 
> an arbitrary manner
> - No input validation - the paths, and files provided at invocation are not 
> validated or required to be in some specific location
> - No signing functionality - there is no way to enforce that the binary was 
> invoked by the NM and not by any other process
> *Code Issues*
> The code layout and implementation themselves can be improved. Some issues 
> there are -
> - No support for log levels - everything is logged and this can’t be turned 
> on or off
> - Extremely long set of invocation parameters (specifically during container 
> launch) which makes turning features on or off complicated
> - Poor test coverage - it’s easy to introduce regressions today due to the 
> lack of a proper test setup
> - Duplicate functionality - there is some amount of code duplication
> - Hard to make improvements or add new features due to the issues raised above
> *Portability*
>  - The container-executor mixes platform dependent APIs with platform 
> independent APIs making it hard to run it on multiple platforms. Allowing it 
> to run on multiple platforms also improves the overall code structure.
> One option is to improve the existing container-executor, however it might be 
> easier to start from scratch. That allows existing functionality to be 
> supported until we are ready to switch to the new code.
> This umbrella JIRA is to capture all the work required for the new code. I'm 
> going to work on a design doc for the changes - any suggestions or 
> improvements are welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

