[jira] [Comment Edited] (YARN-7654) Support ENTRY_POINT for docker container

Eric Yang (JIRA) Wed, 18 Apr 2018 15:57:41 -0700

    [ https://issues.apache.org/jira/browse/YARN-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16443305#comment-16443305 ]


Eric Yang edited comment on YARN-7654 at 4/18/18 10:56 PM:
-----------------------------------------------------------

Following up on today's meetup discussion:

Proposal 1: Environment variables are passed as a section in the .cmd file, 
and docker run constructs command-line arguments to pass the environment 
variables to docker.  With YARN-8079, a user can reference a localized 
env-file from HDFS to support complex use cases in which the software 
developer supplies default environment variables and the system administrator 
can override them.
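
As a rough illustration of proposal 1, the sketch below assembles a docker run 
argument list from a set of environment variables and an optional localized 
env-file.  The function name, image name, and file name are hypothetical and 
are not taken from the patch; only the -e and --env-file flags are real docker 
run options.  The actual container-executor logic may differ.

```python
from typing import Dict, List, Optional


def build_docker_run_args(image: str,
                          env: Dict[str, str],
                          env_file: Optional[str] = None) -> List[str]:
    """Build a docker run argument list (hypothetical helper, not YARN code)."""
    args = ["docker", "run"]
    # Optional env-file, e.g. one localized from HDFS as described in YARN-8079.
    if env_file is not None:
        args += ["--env-file", env_file]
    # Each variable becomes its own -e KEY=VALUE argument on the command line.
    for key in sorted(env):
        args += ["-e", f"{key}={env[key]}"]
    args.append(image)
    return args


# Illustrative values only.
args = build_docker_run_args(
    "example/app:1.0",
    {"JAVA_HOME": "/usr/lib/jvm/default", "APP_MODE": "prod"},
    env_file="app.env",
)
print(" ".join(args))
```

Because every variable is a separate argument, the resulting command line 
grows with the number of variables, which is the long stdout.txt line listed 
as a con for this proposal.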

Proposal 2: All environment key/value pairs are written to a file in the 
nmPrivate directory, and docker run references that file by name.  For 
debugging purposes, the same file is also copied to the log directory.
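
A minimal sketch of proposal 2, under the assumption that the environment is 
written as KEY=VALUE lines readable by docker run --env-file.  A temporary 
directory stands in for nmPrivate, and the helper names, container id, and 
file naming scheme are hypothetical rather than taken from the patch.

```python
import os
import tempfile
from typing import Dict, List


def write_env_file(private_dir: str, container_id: str,
                   env: Dict[str, str]) -> str:
    """Write KEY=VALUE lines to a file readable by its owner only."""
    path = os.path.join(private_dir, f"{container_id}.env")
    # O_EXCL refuses to reuse an existing file; 0o600 requests owner-only access.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        for key in sorted(env):
            f.write(f"{key}={env[key]}\n")
    return path


def docker_run_args(image: str, env_file: str) -> List[str]:
    """Reference the env-file instead of passing each variable with -e."""
    return ["docker", "run", "--env-file", env_file, image]


private_dir = tempfile.mkdtemp()  # stand-in for the nmPrivate directory
env_path = write_env_file(
    private_dir, "container_0001",
    {"APP_MODE": "prod", "DB_HOST": "db.example.com"},
)
args = docker_run_args("example/app:1.0", env_path)
print(" ".join(args))
```

The command line stays short and the values stay out of it, but whoever runs 
docker must be able to read the file, which is the docker run user versus yarn 
user limitation raised for this proposal.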

The current implementation is written to support proposal 1.

Proposal 1
| Pros | Cons |
| No additional file to clean up; environment variables are visible on the docker run command line | The docker run command is one long line in stdout.txt, which may look ugly to some users |
| Separation of duties: system administrators can run services and override developer-supplied environment variables | |
| Can hide environment secrets from the stdout.txt log file (YARN-8079) | The environment file is not copied to the log directory |
| No filename clashes in the localizer directory, because the user defines the env-file filename | Possible abuse by sourcing another container's environment file by guessing its exact file path (high degree of difficulty) |
| Saves one inode by not duplicating the env-file in the log directory when log aggregation is not enabled | |
| Extensible to support launching "docker run" as a non-yarn user; yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user can be supported | |

Proposal 2
| Pros | Cons |
| Fewer code paths to maintain | |
| | No separation of duties; only one way to define environment variables |
| Hides environment secrets from the stdout.txt log file | The environment file is copied to the log directory, so environment secrets are still visible to the yarn logs CLI command |
| | Possible limitation when the docker run user != the YARN user, because the nmPrivate directory is restricted to the yarn user |
| | The referenced env-file is at a different path from the one displayed to the user, which can confuse first-time users |
| | Costs one extra inode to duplicate the env-file to the log directory for debugging |

[~jlowe] [~ebadger] [~Jim_Brennan] [[email protected]] [~billie.rinaldi] I 
have tried to summarize the pros and cons of both possible implementations.  
I may be biased toward proposal 1 because it is already implemented.  We can 
rework the code to follow proposal 2 if we are willing to accept the 
limitation that the docker run user == root user, and that the root user 
always has access to the nmPrivate directory.  I am not ready to make that 
commitment, which is why I am asking everyone to provide input; we will go 
with the majority on this matter.  Thank you for reviewing this patch.


> Support ENTRY_POINT for docker container
> ----------------------------------------
>
>                 Key: YARN-7654
>                 URL: https://issues.apache.org/jira/browse/YARN-7654
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>    Affects Versions: 3.1.0
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Blocker
>         Attachments: YARN-7654.001.patch, YARN-7654.002.patch, 
> YARN-7654.003.patch, YARN-7654.004.patch, YARN-7654.005.patch, 
> YARN-7654.006.patch, YARN-7654.007.patch, YARN-7654.008.patch, 
> YARN-7654.009.patch, YARN-7654.010.patch, YARN-7654.011.patch, 
> YARN-7654.012.patch, YARN-7654.013.patch, YARN-7654.014.patch, 
> YARN-7654.015.patch
>
>
> A Docker image may have ENTRY_POINT predefined, but this is not supported in 
> the current implementation.  It would be nice if we could detect the 
> existence of {{launch_command}} and, based on this variable, launch the 
> docker container in different ways:
> h3. Launch command exists
> {code}
> docker run [image]:[version]
> docker exec [container_id] [launch_command]
> {code}
> h3. Use ENTRY_POINT
> {code}
> docker run [image]:[version]
> {code}
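
The branching in the description above can be sketched as follows.  This is 
only an illustration of the decision on launch_command; the function name and 
sample values are hypothetical, and options such as detached mode or container 
naming are left out just as they are in the description.

```python
from typing import List, Optional


def plan_launch(image: str, version: str, container_id: str,
                launch_command: Optional[List[str]]) -> List[List[str]]:
    """Return the docker commands to issue, in order (hypothetical helper)."""
    run = ["docker", "run", f"{image}:{version}"]
    if launch_command:
        # A launch command exists: start the container, then exec the command.
        return [run, ["docker", "exec", container_id] + launch_command]
    # No launch command: rely on the entry point defined in the image.
    return [run]


with_cmd = plan_launch("example/app", "1.0", "c1", ["sleep", "60"])
entry_point_only = plan_launch("example/app", "1.0", "c1", None)
print(with_cmd)
print(entry_point_only)
```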



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

