[ 
https://issues.apache.org/jira/browse/YARN-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469613#comment-16469613
 ] 

Jason Lowe commented on YARN-7654:
----------------------------------

In the interest in trying to get this into 3.1, I'm OK with going with the 
foreground approach for entry point for now as long as the pre-existing 
containers launch as before (which appears to be the case with the patch).  
Changing the entry-point launch logic is something we should be able to update 
in the future since it's details are hidden from the user.

Comments on the latest patch:

In DockerProviderService#buildContainerLaunchContext it's calling 
processArtifact then super.buildContainerLaunchContext, but the parent's 
buildContainerLaunchContext calls processArtifact as well.  Is the double-call 
intentional?

It looks like the parent is called but in the end the result is smashed by the 
new setCommand method, and much of the parent method's code was replicated in 
this method.  Rather than requiring a setCommand interface be added to 
AbstractLauncher to clobber work previously done, would it make more sense to 
refactor AbstractProviderService#buildContainerLaunchContext so the pieces 
needed by DockerProviderService can be reused without requiring the launcher 
command to be clobbered afterwards?

globalTokens and tokensForSubstitution are unconditionally computed but only 
needed if the launchCommand is empty.  The computation should be moved inside 
the conditional.

typo: "Base on discussion" s/b "Based on the discussion"

ENV_DOCKER_COTAINER_ENV_FILE is not used and should be removed.

DockerLinuxContainerRuntime imports ENV_DOCKER_CONTAINER_RUN_OVERRIDE_DISABLE 
from itself.
dockerOverride is true when the override disable flag is true which is 
confusing.  That makes me think dockerOverride actually means "use entry point" 
but the name implies it is true when we are overriding docker and _not_ using 
the entry point.  A name like useEntryPoint would make the code much easier to 
follow if that's indeed what the boolean means.  One example of confusing code 
that looks like the logic is backwards:
{code}
    if (dockerOverride) {
       LOG.info("command override disabled");
       runCommand.setOverrideDisabled(true);
{code}

Nit: The patch is doing the double-env lookup pattern again.  Would be helpful 
to have a utility method like getEnvBoolean that can be used to lookup 
environment variables whose values are treated like booleans.

Is it necessary to do the following in isolation or can it be folded into the 
primary conditional that does entry point logic vs. override logic a bit below?
{code}
    if (!dockerOverride) {
      runCommand.setContainerWorkDir(containerWorkDir.toString());
    }
{code}

Should we order the environment by dependencies as we do for normal container 
launches?  I see DockerRunCommand is using a TreeMap, but would a LinkedHashMap 
make more sense?  Then we're ordering things based on the order they were added.

DockerClient is creating the environment file in /tmp which has the same 
leaking problem we had with the docker .cmd files.

writeEnvFile should use try-with-resources to make sure the writer is always 
closed even when something throws.

writeCommandToTempFile should use try-with-resources so the printWriter is 
always closed rather than placing explicit close calls on certain errors.

The instance checking and downcasting in writeCommandToTempFile looks pretty 
ugly.  It would be cleaner to encapsulate this in the DockerCommand 
abstraction.  One example way to do this is to move the logic of writing a 
docker command file into the DockerCommand abstract class.  DockerRunCommand 
can then override that method to call the parent method and then separately 
write the env file.  Worst case we can add a getEnv method to DockerCommand 
that returns the collection of environment variables to write out for a 
command.  DockerCommand would return null or an empty collection while 
DockerRunCommand can return its environment.

There's no need to have a containsEnv method and a getEnv method, especially 
when getEnv is cheap.

Nit: It would be nice if DockerRunCommand had a method to add a Collection of 
environment variables rather than forcing the caller to iterate themselves.

Nit: The following should just be {{String value = Boolean.toString(toggle)}}
{code}
    String value = "false";
    if (toggle) {
      value = "true";
    }
{code}

init_log_path should free tmp_buffer before setting it to NULL.

use_entry_point returns false when the entry point should be used.

The code is now writing "Launching docker container..." etc. even when not 
using the entry point.  Are these smashed by the container_launch.sh script 
when not using the entry point?  If not it could be an issue since it's 
changing what the user's code is writing to those files today.

It would be be nice to factor out the entry-point-specific code path to do the 
docker inspect with retries logic into a separate function for readability.  It 
looks like it would come out pretty cleanly.



> Support ENTRY_POINT for docker container
> ----------------------------------------
>
>                 Key: YARN-7654
>                 URL: https://issues.apache.org/jira/browse/YARN-7654
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>    Affects Versions: 3.1.0
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Blocker
>              Labels: Docker
>         Attachments: YARN-7654.001.patch, YARN-7654.002.patch, 
> YARN-7654.003.patch, YARN-7654.004.patch, YARN-7654.005.patch, 
> YARN-7654.006.patch, YARN-7654.007.patch, YARN-7654.008.patch, 
> YARN-7654.009.patch, YARN-7654.010.patch, YARN-7654.011.patch, 
> YARN-7654.012.patch, YARN-7654.013.patch, YARN-7654.014.patch, 
> YARN-7654.015.patch, YARN-7654.016.patch, YARN-7654.017.patch, 
> YARN-7654.018.patch, YARN-7654.019.patch, YARN-7654.020.patch, 
> YARN-7654.021.patch
>
>
> Docker image may have ENTRY_POINT predefined, but this is not supported in 
> the current implementation.  It would be nice if we can detect existence of 
> {{launch_command}} and base on this variable launch docker container in 
> different ways:
> h3. Launch command exists
> {code}
> docker run [image]:[version]
> docker exec [container_id] [launch_command]
> {code}
> h3. Use ENTRY_POINT
> {code}
> docker run [image]:[version]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to