Miklos Szegedi commented on YARN-5714:

Thank you for the patch, [~rcatherinot]!
1. "%%A%%" should not match any variable in Windows, but the current patch 
returns "A". A double %% is replaced with a single %.
2. This is a potential double increment of i without checking i < l:
994                 if (c == '\\') {
995                   i++;
996                 }
997                 if (c == '\'') {
998                   break;
999                 }
1000                i++;
3. There is a typo in this sentence ("support" should be "supports"):
855           // 2 - we need a map implementation that support entries to be

4. You have an iteration with a nesting depth of 3. It runs at every 
container launch, so it might cause performance issues. In particular, 
{code}copy.containsKey(envDep){code} will be called again and again (6 times?) 
until all dependencies are added, if a set is ordered like this:
D=$A $B $C

Have you considered an algorithm where you scan the dependencies only once? 
You may not even need the hash map in that case.
for (env: envsI)
  env = stack.pop()
  if (!env.marked)
    if (env.dependencies.empty)
      for(dep : env.dependencies())
5. It might be slower (needs to be tested) but have you considered regex for 
parsing? It would make the code shorter and easier to understand.
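
To make items 4 and 5 concrete, here is a minimal sketch, not the patch's 
implementation. It assumes Unix-style references only ($NAME and ${NAME}), 
ignores quoting and escaping, and assumes no null values in the map. The 
class name EnvOrderSketch, the method order() and the REF pattern are 
hypothetical names introduced for illustration. It uses recursion instead of 
the explicit stack from the pseudocode above, and each variable's value is 
parsed only once.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EnvOrderSketch {
  // Hypothetical pattern: matches ${NAME} (group 1) or $NAME (group 2).
  // Assumption: quoting and backslash escapes are NOT handled here.
  private static final Pattern REF =
      Pattern.compile("\\$(?:\\{(\\w+)\\}|(\\w+))");

  /** Returns the entries ordered so that referenced variables come first. */
  static Map<String, String> order(Map<String, String> env) {
    Map<String, String> ordered = new LinkedHashMap<>();
    Set<String> visiting = new HashSet<>();
    for (String name : env.keySet()) {
      visit(name, env, visiting, ordered);
    }
    return ordered;
  }

  private static void visit(String name, Map<String, String> env,
      Set<String> visiting, Map<String, String> ordered) {
    // Skip names that are already emitted, are not defined in this map,
    // or are on the current path (a cycle, which is broken arbitrarily).
    if (ordered.containsKey(name) || !env.containsKey(name)
        || !visiting.add(name)) {
      return;
    }
    String value = env.get(name);
    Matcher m = REF.matcher(value);
    while (m.find()) {
      String dep = m.group(1) != null ? m.group(1) : m.group(2);
      visit(dep, env, visiting, ordered);  // emit dependencies first
    }
    visiting.remove(name);
    ordered.put(name, value);
  }
}
```

With the case from the issue description, a map in which LD_LIBRARY_PATH 
references $HADOOP_COMMON_HOME would come back with HADOOP_COMMON_HOME first, 
whatever the iteration order of the input map. A very long dependency chain 
could exhaust the call stack, which is one argument for the explicit stack.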

> ContainerExecutor does not order environment map
> ------------------------------------------------
>                 Key: YARN-5714
>                 URL: https://issues.apache.org/jira/browse/YARN-5714
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.4.1, 2.5.2, 2.7.3, 2.6.4, 3.0.0-alpha1
>         Environment: all (linux and windows alike)
>            Reporter: Remi Catherinot
>            Assignee: Remi Catherinot
>            Priority: Trivial
>              Labels: oct16-medium
>         Attachments: YARN-5714.001.patch, YARN-5714.002.patch, 
> YARN-5714.003.patch, YARN-5714.004.patch
>   Original Estimate: 120h
>  Remaining Estimate: 120h
> When dumping the launch container script, environment variables are dumped 
> in the order used internally by the map implementation (hash based). It 
> does not take into consideration that some env variables may refer to each 
> other, so some env variables must be declared before those 
> referencing them.
> In my case, I ended up with LD_LIBRARY_PATH, which depends on 
> HADOOP_COMMON_HOME, being dumped before HADOOP_COMMON_HOME. Thus it had a 
> wrong value and native libraries weren't loaded. Jobs were running but not 
> at their best efficiency. This is just one use case that hits this bug, but 
> I'm sure others may happen as well.
> I already have a patch running in my production environment. I estimate 
> 5 days to package the patch in the right fashion for JIRA and to try my best 
> to add tests.
> Note: the patch is not OS aware with a default empty implementation. I will 
> only implement the Unix version in a first release. I'm not used to Windows 
> env variable syntax, so it will take me more time and research.
