[ 
https://issues.apache.org/jira/browse/YARN-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561873#comment-13561873
 ] 

Chris Nauroth commented on YARN-316:
------------------------------------

Alejandro and Bikas, thanks for the feedback.  I'm working on a new patch, but 
here are a few quick replies first:

{quote}
On the 1st, agree, and i don't think the perf hit will be noticeable, the most 
a a few of millisecs.
{quote}

The next version of the patch will bundle the classpath into a temp jar on all 
platforms, not just Windows.

{quote}
If its not too much work, could you try using 
http://commons.apache.org/lang/api-release/org/apache/commons/lang3/text/StrSubstitutor.html
 instead of writing a custom string substitutor. Dont bother if its going to be 
a lot of effort.
{quote}

I had looked at StrSubstitutor earlier, but there were some things that make it 
awkward to use for this logic.

If the variable is undefined, then the StrSubstitutor will leave the variable 
name in place instead of the more traditional shell behavior of replacing it 
with empty string.  For example, consider "$FOO_$BAR_$BAZ" and an environment 
consisting of FOO=one and BAZ=two (BAR is undefined).  StrSubstitutor returns 
"one_$BAR_two" instead of "one__two", which is what we expect from shell.  To 
work around this, we'd need to wrap the environment map in a custom map that 
returns default values (i.e. Guava MapMaker) or subclass StrLookup: 
http://commons.apache.org/lang/api-2.5/org/apache/commons/lang/text/StrLookup.html.

The other problem is that StrSubstitutor works best for matching variable names 
that have a static prefix and suffix.  This works great for Windows 
("%VAR%/foo"), but now that we're going to do the same thing for non-Windows, 
we also need to handle shell variable names ("$VAR/foo").  We need to parse $, 
followed by multiple legal variable name characters, terminated by any 
non-legal variable name character.  That can't be expressed with a static 
suffix, but it's easily expressed with a regex.  Another alternative is to 
subclass StrMatcher: 
http://commons.apache.org/lang/api-2.5/org/apache/commons/lang/text/StrMatcher.html.

It's definitely possible to make StrSubstitutor behave the way we need, but all 
things considered, it would probably take at least double the code compared to 
{{StringUtils#replaceTokens}}.  I'm not planning on switching to StrSubstitutor 
in the next patch, but if you disagree, please let me know.

{quote}
What prompted this change in MiniYarnCluster?
{quote}

I forgot to mention this part.  At this point in the code, it's trying to 
create a directory at a deeply nested path, and the parent path doesn't exist 
yet.  mkdir() was returning false.  This wasn't causing test failures on Linux, 
because the directory was still getting created later during container 
initialization.  However, it is a problem on Windows with the temp test 
directory symlink, because winutils symlink currently requires that the target 
already exists.  (See HADOOP-9043.)  I switched this to mkdirs() so that it 
would recursively create the full path.

{quote}
Do we still need to use SimpleNames after using symlink?
{quote}

Yes, unfortunately, the symlink alone isn't sufficient.  Here is an example of 
the kind of test working directory it was using before my patch (390 
characters):

C:/hdc/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.ClusterMapReduceTestCaseConfigurableMiniMRCluster/org.apache.hadoop.mapred.ClusterMapReduceTestCaseConfigurableMiniMRCluster-localDir-nm-1_1/usercache/cnauroth/appcache/application_1358229151479_0001/container_1358229151479_0001_01_000001/default_container_executor.cmd

Using the temp symlink, that turns into this path (270 characters, still over 
the limit of 260):

C:\Users\cnauroth\AppData\Local\Temp\1358803955776\org.apache.hadoop.mapred.ClusterMapReduceTestCaseConfigurableMiniMRCluster-localDir-nm-1_1\usercache\cnauroth\appcache\application_1358229151479_0001\container_1358229151479_0001_01_000001\default_container_executor.cmd

Then, with the switch to simple class name, it fits (244 characters, bringing 
us under the limit of 260):

C:\Users\cnauroth\AppData\Local\Temp\1358803955776\ClusterMapReduceTestCaseConfigurableMiniMRCluster-localDir-nm-1_1\usercache\cnauroth\appcache\application_1358229151479_0001\container_1358229151479_0001_01_000001\default_container_executor.cmd

{quote}
Is there a JIRA to make the env substitution work for branch-1-win when 
creating the classpath manifest? What about * expansion?
{quote}

Thank you for the reminder.  I just filed MAPREDUCE-4959 to backport this logic 
to branch-1-win.  In my next version of this patch, I'm also going to try to 
refactor more of the logic currently in {{ContainerLaunch}} to 
{{FileUtil#createJarWithClassPath}}.  I expect that will make the code easier 
to backport to branch-1-win, because we'll have most of the logic in 
hadoop-common, and then it's just a matter of different call sites in MapReduce 
v1 vs. YARN.

                
> YARN container launch may exceed maximum Windows command line length due to 
> long classpath
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-316
>                 URL: https://issues.apache.org/jira/browse/YARN-316
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.0.0, trunk-win
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: YARN-316-branch-trunk-win.1.patch, 
> YARN-316-branch-trunk-win.2.patch
>
>
> On Windows, a command line longer than 8192 characters will fail.  This can 
> cause YARN container launch to fail on Windows if the classpath argument 
> exceeds this limit.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to