[ https://issues.apache.org/jira/browse/YARN-9585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Ahuja updated YARN-9585:
----------------------------------
    Description: 
Transitive dependencies amongst environment variables set up inside the 
ContainerLaunchContext cannot be guaranteed to be resolved consistently during 
the setup of the container launch script.

For example, when environment variables are set in the following order inside 
the ContainerLaunchContext:

{code}
A_HOME=/path/to/A
PATH=$A_HOME:/path/to/somethingelse
{code}

the "export" statements inside launch_container.sh may not be written in the 
same order, which would mean that launch_container.sh may look like this:

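{code}
#!/bin/bash

…
echo "Setting up env variables"
…
export PATH=$A_HOME:/path/to/somethingelse
export A_HOME=/path/to/A
{code}

Here A_HOME is not yet defined when PATH is exported, so $A_HOME expands to an 
empty string unless it is already set in the environment, whereas the 
expectation would be:

{code}
export A_HOME=/path/to/A
export PATH=/path/to/A:/path/to/somethingelse
{code}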
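
The effect of the two orderings can be reproduced with a short bash sketch. This is only an illustration, not the generated launch_container.sh: DEMO_PATH is a hypothetical stand-in for PATH so that the running shell's real PATH is left alone, and each ordering runs in its own subshell with A_HOME unset first.

```shell
#!/bin/bash
# Illustration only. DEMO_PATH is a hypothetical stand-in for PATH.

# Ordering that may be generated: the dependent variable is exported first.
wrong_order=$(
  unset A_HOME
  export DEMO_PATH="$A_HOME:/path/to/somethingelse"
  export A_HOME=/path/to/A
  echo "$DEMO_PATH"
)

# Expected ordering: A_HOME is exported before it is referenced.
right_order=$(
  unset A_HOME
  export A_HOME=/path/to/A
  export DEMO_PATH="$A_HOME:/path/to/somethingelse"
  echo "$DEMO_PATH"
)

echo "wrong order: $wrong_order"   # prints "wrong order: :/path/to/somethingelse"
echo "right order: $right_order"   # prints "right order: /path/to/A:/path/to/somethingelse"
```

In the first case the reference is expanded at the moment of the export, so exporting A_HOME afterwards does not repair the value.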

This behaviour can be explained as follows:

From [1] and [2], we can see that the environment variables are exported 
inside the launch container script by the Java code. The key thing to note 
here is that the environment variables are retrieved from a "Map". From [3], 
[4] and [5], this map is a HashMap and, per [6], a HashMap does not guarantee 
the iteration order of its entries (env vars in this case).

Therefore, with the current code implementation, the "export" order of the 
environment variables inside the launch_container.sh script cannot be 
guaranteed.
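
Since the order cannot be relied upon, it may help to check a generated script for exports that reference a variable exported only further down. The following is a minimal bash sketch of such a check. The function name find_forward_refs is hypothetical and not part of YARN, and the match is a plain substring test on $NAME or ${NAME}, so a name that is a prefix of another name can produce a false positive.

```shell
#!/bin/bash
# Hypothetical helper (not part of YARN): report each "export" line that
# references a variable which is exported only further down in the script.
find_forward_refs() {
  local script=$1 line name value other
  local seen=" "            # space-delimited names exported so far
  local -a all_names=()
  local export_re='^export[[:space:]]+([A-Za-z_][A-Za-z0-9_]*)=(.*)$'

  # Pass 1: collect every exported variable name.
  while IFS= read -r line; do
    if [[ $line =~ $export_re ]]; then
      all_names+=("${BASH_REMATCH[1]}")
    fi
  done <<< "$script"

  # Pass 2: walk the exports in order and flag forward references.
  while IFS= read -r line; do
    [[ $line =~ $export_re ]] || continue
    name=${BASH_REMATCH[1]}
    value=${BASH_REMATCH[2]}
    for other in "${all_names[@]}"; do
      [[ $seen == *" $other "* ]] && continue   # already exported above
      [[ $other == "$name" ]] && continue       # self-reference, e.g. PATH=$PATH:...
      # Simplification: substring match on $NAME or ${NAME}.
      if [[ $value == *"\$$other"* || $value == *"\${$other}"* ]]; then
        echo "$name references $other before it is exported"
      fi
    done
    seen+="$name "
  done <<< "$script"
}

# Example input modelled on the snippet in this issue.
generated='export PATH="$A_HOME:/path/to/somethingelse"
export A_HOME="/path/to/A"'
find_forward_refs "$generated"   # prints "PATH references A_HOME before it is exported"
```

The check only reports the problem. It does not reorder anything, and it prints nothing when every referenced variable is exported before its use.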

I have raised this JIRA after discussion with [~wilfreds] so that this 
behaviour can be improved in the future.

[1] https://github.com/apache/hadoop/blob/release-3.0.0-RC0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java#L353
[2] https://github.com/apache/hadoop/blob/release-3.0.0-RC0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java#L1200
[3] https://github.com/apache/hadoop/blob/release-3.0.0-RC0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java#L203
[4] https://github.com/apache/hadoop/blob/release-3.0.0-RC0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ContainerLaunchContextPBImpl.java#L383
[5] https://github.com/apache/hadoop/blob/release-3.0.0-RC0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ContainerLaunchContextPBImpl.java#L393
[6] https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html

  was:
Transitive dependencies amongst environment variables set up inside the 
ContainerLaunchContext cannot be guaranteed to be resolved consistently during 
the setup of the container launch script.

For example, when environment variables are set in the following order inside 
the ContainerLaunchContext:

{code}
A_HOME=/path/to/A
PATH=$A_HOME:/path/to/somethingelse
{code}

the "export" inside the launch_container.sh may not happen in the same order 
which would mean that the launch_container.sh may look like this:

{code}
#!/bin/bash

…
echo "Setting up env variables"
…
export PATH=$A_HOME:/path/to/somethingelse
export A_HOME=/path/to/A
{code}

whereas the expectation would be:

{code}
export A_HOME=/path/to/A
export PATH=/path/to/A:/path/to/somethingelse
{code}

The explanation of this behaviour is as per below:

From [1] and [2], we can see that the environment variables are getting 
exported inside the launch container script by the Java code. The key thing to 
note here is that the environment variables are retrieved from inside a "Map". 
From [3], [4] and [5], this map is a HashMap and from [6], the order of 
entries (env vars in this case) cannot be guaranteed.

Therefore, the "export" order of the environment variables inside the 
launch_container.sh script cannot be guaranteed as a result.

As such, this explains the inconsistent nature of the working of transitive 
dependencies.

It does not seem that transitive dependencies can be handled correctly based on 
the current code implementation, as such, I have gone ahead and raised this 
JIRA after discussion with [~wilfreds] to look into this improvement in the 
future.

[1] https://github.com/apache/hadoop/blob/release-3.0.0-RC0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java#L353
[2] https://github.com/apache/hadoop/blob/release-3.0.0-RC0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java#L1200
[3] https://github.com/apache/hadoop/blob/release-3.0.0-RC0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java#L203
[4] https://github.com/apache/hadoop/blob/release-3.0.0-RC0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ContainerLaunchContextPBImpl.java#L383
[5] https://github.com/apache/hadoop/blob/release-3.0.0-RC0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ContainerLaunchContextPBImpl.java#L393
[6] https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html


> Order of environment variables set up inside ContainerLaunchContext cannot be 
> guaranteed inside the launch_container.sh resulting in inconsistent 
> resolution of transitive environment variables
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-9585
>                 URL: https://issues.apache.org/jira/browse/YARN-9585
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: Siddharth Ahuja
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
