[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776072#comment-13776072
 ] 

Siddharth Seth commented on YARN-1229:
--------------------------------------

I'm in favour of renaming the shuffle service id as well, and of enforcing 
constraints on aux-service names. Shell parameter names apparently have 
restrictions - 
http://stackoverflow.com/questions/2821043/allowed-characters-in-linux-environment-variable-names
 has some links to the standards. Basing the aux-service name restrictions on 
the shell name restrictions seems ok to me.
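
A rough sketch of what such a check could look like, assuming the usual shell 
rule (letters, digits and underscores, not starting with a digit). The class 
and method names here are hypothetical and not part of the NodeManager code; 
only the NM_AUX_SERVICE_ prefix is taken from the error in the issue 
description.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not an existing YARN class.
class AuxServiceNames {
    // Assumed rule: same shape as a shell variable name.
    private static final Pattern SHELL_SAFE =
        Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // Prefix as it appears in the failing export line of launch_container.sh.
    private static final String ENV_PREFIX = "NM_AUX_SERVICE_";

    // True if the name can be embedded in an exported variable name.
    static boolean isValid(String name) {
        return name != null && SHELL_SAFE.matcher(name).matches();
    }

    // Builds the environment variable name, rejecting unsafe service names
    // up front instead of failing later inside the launch script.
    static String envVarFor(String name) {
        if (!isValid(name)) {
            throw new IllegalArgumentException(
                "Invalid aux-service name: " + name);
        }
        return ENV_PREFIX + name;
    }
}
```

With a check like this, "mapreduce.shuffle" would be rejected when the 
service is configured rather than when a container is launched. Since the 
prefix already starts with a letter, the leading-digit rule is stricter than 
the shell strictly needs; it is kept here only to keep the rule simple.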

This is an incompatible change, though. Sites that have Hadoop 2 (or 0.23) 
deployed would need to update their configs to reflect the new shuffle service 
name. (The shuffle service isn't started when using the default Hadoop 
configuration files.)

An alternative would be to base32-encode the service name, but I would 
prefer not to go there.
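
For reference, the base32 route might look roughly like the sketch below. It 
is only an illustration: the class is hypothetical, the encoder is hand-rolled 
using the RFC 4648 alphabet, and padding is dropped because '=' is not allowed 
in a shell variable name either.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an existing YARN class.
class Base32EnvName {
    // RFC 4648 base32 alphabet: only upper-case letters and digits 2-7.
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    // Encodes the UTF-8 bytes of the name as unpadded base32.
    static String encode(String serviceName) {
        byte[] data = serviceName.getBytes(StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder();
        int buffer = 0; // holds bits not yet emitted
        int bits = 0;   // number of valid bits in buffer
        for (byte b : data) {
            buffer = (buffer << 8) | (b & 0xFF);
            bits += 8;
            while (bits >= 5) {
                out.append(ALPHABET.charAt((buffer >> (bits - 5)) & 0x1F));
                bits -= 5;
            }
            buffer &= (1 << bits) - 1; // keep only the leftover bits
        }
        if (bits > 0) {
            // Left-align the final partial group, as RFC 4648 prescribes.
            out.append(ALPHABET.charAt((buffer << (5 - bits)) & 0x1F));
        }
        return out.toString();
    }

    // Prefix taken from the failing export line in the issue description.
    static String envVarFor(String serviceName) {
        return "NM_AUX_SERVICE_" + encode(serviceName);
    }
}
```

This keeps existing service names such as "mapreduce.shuffle" working without 
a config change, at the cost of unreadable variable names in the launch 
script and a decode step for anything that reads them back.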
                
> Shell$ExitCodeException could happen if AM fails to start
> ---------------------------------------------------------
>
>                 Key: YARN-1229
>                 URL: https://issues.apache.org/jira/browse/YARN-1229
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.1.1-beta
>            Reporter: Tassapol Athiapinya
>            Assignee: Xuan Gong
>            Priority: Blocker
>             Fix For: 2.1.1-beta
>
>
> I ran a sleep job. If the AM fails to start, this exception can occur:
> 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
> state FAILED due to: Application application_1379673267098_0020 failed 1 
> times due to AM Container for appattempt_1379673267098_0020_000001 exited 
> with  exitCode: 1 due to: Exception from container-launch:
> org.apache.hadoop.util.Shell$ExitCodeException: 
> /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_000001/launch_container.sh:
>  line 12: export: 
> `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
> ': not a valid identifier
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
> at org.apache.hadoop.util.Shell.run(Shell.java:379)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> .Failing this attempt.. Failing the application.

