[ 
https://issues.apache.org/jira/browse/YARN-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710000#comment-14710000
 ] 

Varun Saxena commented on YARN-3011:
------------------------------------

[~djp], sorry, I had missed your comment.
I was under a similar impression when I wrote the comment in January.

But actually all daemons, including the NodeManager, explicitly set the 
yarn.dispatcher.exit-on-error configuration to true in serviceInit. 
{code}
conf.setBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY, true);
{code}

That means any value configured for this key is completely disregarded in 
daemons. The default value of false is meant for test cases, to avoid a JVM 
exit. This is clearly documented in Dispatcher.java. Being an internal 
configuration, it is not included in yarn-default.xml either.
{code}
  // Configuration to make sure dispatcher crashes but doesn't do system-exit in
  // case of errors. By default, it should be false, so that tests are not
  // affected. For all daemons it should be explicitly set to true so that
  // daemons can crash instead of hanging around.
  public static final String DISPATCHER_EXIT_ON_ERROR_KEY =
      "yarn.dispatcher.exit-on-error";
{code}
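
To make the effect of the flag concrete, here is a minimal, self-contained sketch. It is not the actual AsyncDispatcher code: the class ExitOnErrorSketch, the Map standing in for the Hadoop Configuration, and the exitAction hook (used in place of a real JVM exit) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical stand-in for a dispatcher that honours an exit-on-error flag. */
final class ExitOnErrorSketch {
    static final String DISPATCHER_EXIT_ON_ERROR_KEY = "yarn.dispatcher.exit-on-error";
    static final boolean DEFAULT_DISPATCHER_EXIT_ON_ERROR = false;

    private final boolean exitOnError;
    private final Runnable exitAction; // hypothetical hook; a daemon would exit the JVM here
    int failures = 0;

    ExitOnErrorSketch(Map<String, String> conf, Runnable exitAction) {
        // A plain Map stands in for the Hadoop Configuration object (assumption).
        String raw = conf.get(DISPATCHER_EXIT_ON_ERROR_KEY);
        this.exitOnError = (raw == null)
            ? DEFAULT_DISPATCHER_EXIT_ON_ERROR
            : Boolean.parseBoolean(raw);
        this.exitAction = exitAction;
    }

    /** Runs one handler; an escaped exception is counted and, if the flag is set, triggers the exit hook. */
    void dispatch(Runnable handler) {
        try {
            handler.run();
        } catch (RuntimeException e) {
            failures++;
            if (exitOnError) {
                exitAction.run();
            }
        }
    }

    public static void main(String[] args) {
        Runnable failing = () -> {
            throw new IllegalArgumentException("Can not create a Path from an empty string");
        };

        // Test-style setup: key unset, so the default (false) applies.
        boolean[] testExited = {false};
        ExitOnErrorSketch testDispatcher =
            new ExitOnErrorSketch(new HashMap<>(), () -> testExited[0] = true);
        testDispatcher.dispatch(failing);
        System.out.println("test mode exit requested: " + testExited[0]);

        // Daemon-style setup: key forced to true, as the daemons do in serviceInit.
        Map<String, String> daemonConf = new HashMap<>();
        daemonConf.put(DISPATCHER_EXIT_ON_ERROR_KEY, "true");
        boolean[] daemonExited = {false};
        ExitOnErrorSketch daemonDispatcher =
            new ExitOnErrorSketch(daemonConf, () -> daemonExited[0] = true);
        daemonDispatcher.dispatch(failing);
        System.out.println("daemon mode exit requested: " + daemonExited[0]);
    }
}
```

Injecting the exit action keeps the sketch safe to run in a test JVM, which is the same concern the false default addresses.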

We could probably set this config to true in daemons only if 
yarn.dispatcher.exit-on-error is not already set in the config file. Thoughts?
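
A rough sketch of that "set only if unset" idea follows. It uses a plain Map as a stand-in for the Hadoop Configuration, and the class and method names (ExitOnErrorDefaulting, applyDaemonDefault) are hypothetical. With the real Configuration the check would presumably be a null test on conf.get(key) before calling setBoolean, but that is an assumption and not something from a patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: apply the daemon default without overriding an explicit setting. */
final class ExitOnErrorDefaulting {
    static final String DISPATCHER_EXIT_ON_ERROR_KEY = "yarn.dispatcher.exit-on-error";

    /** Sets the flag to true only when the loaded configuration does not define it. */
    static void applyDaemonDefault(Map<String, String> conf) {
        conf.putIfAbsent(DISPATCHER_EXIT_ON_ERROR_KEY, "true");
    }

    public static void main(String[] args) {
        // Nothing in the config file: the daemon default applies.
        Map<String, String> unset = new HashMap<>();
        applyDaemonDefault(unset);
        System.out.println("unset    -> " + unset.get(DISPATCHER_EXIT_ON_ERROR_KEY));

        // Operator explicitly set false: the explicit value is kept.
        Map<String, String> explicit = new HashMap<>();
        explicit.put(DISPATCHER_EXIT_ON_ERROR_KEY, "false");
        applyDaemonDefault(explicit);
        System.out.println("explicit -> " + explicit.get(DISPATCHER_EXIT_ON_ERROR_KEY));
    }
}
```
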
But is there any real use case for making this configurable? A recoverable 
exception should be caught and handled, and NOT leaked through to 
AsyncDispatcher. And a non-recoverable one should lead to a crash anyway.
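
As an illustration of "caught and handled", the sketch below shows a handler that turns a bad resource path into a per-resource failure, so the exception never reaches the dispatcher. Everything here (RecoverableHandlerSketch, pathForLocalization, the "/local/" prefix, the failed list) is hypothetical and only mirrors the shape of the problem; it is not the NodeManager code or the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical event handler that contains a recoverable error instead of leaking it. */
final class RecoverableHandlerSketch {
    final List<String> queued = new ArrayList<>(); // resources accepted for download
    final List<String> failed = new ArrayList<>(); // resources rejected, with the reason

    /** Stand-in for path construction that rejects empty input (assumption). */
    static String pathForLocalization(String resourceName) {
        if (resourceName == null || resourceName.isEmpty()) {
            throw new IllegalArgumentException("Can not create a Path from an empty string");
        }
        return "/local/" + resourceName;
    }

    /** Handles one request; returns false and records the failure instead of throwing. */
    boolean handle(String resourceName) {
        try {
            queued.add(pathForLocalization(resourceName));
            return true;
        } catch (IllegalArgumentException e) {
            // Recoverable: fail this one resource and keep the handler (and daemon) alive.
            failed.add("'" + resourceName + "': " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        RecoverableHandlerSketch handler = new RecoverableHandlerSketch();
        handler.handle("json-simple-1.1.jar");
        handler.handle(""); // would otherwise escape to the dispatcher thread
        System.out.println("queued: " + handler.queued);
        System.out.println("failed: " + handler.failed);
    }
}
```
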
cc [~djp], [~jianhe]

> NM dies because of the failure of resource localization
> -------------------------------------------------------
>
>                 Key: YARN-3011
>                 URL: https://issues.apache.org/jira/browse/YARN-3011
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.5.1
>            Reporter: Wang Hao
>            Assignee: Varun Saxena
>              Labels: 2.6.1-candidate
>             Fix For: 2.7.0
>
>         Attachments: YARN-3011.001.patch, YARN-3011.002.patch, 
> YARN-3011.003.patch, YARN-3011.004.patch
>
>
> NM dies because of an IllegalArgumentException when localizing a resource.
> 2014-12-29 13:43:58,699 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>  Downloading public rsrc:{ 
> hdfs://hadoop002.dx.momo.com:8020/user/hadoop/share/lib/oozie/json-simple-1.1.jar,
>  1416997035456, FILE, null }
> 2014-12-29 13:43:58,699 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>  Downloading public rsrc:{ 
> hdfs://hadoop002.dx.momo.com:8020/user/hive/src/final_test_ooize/test_ooize_job1.sql/,
>  1419831474153, FILE, null }
> 2014-12-29 13:43:58,701 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.lang.IllegalArgumentException: Can not create a Path from an empty string
>         at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:135)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:94)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.getPathForLocalization(LocalResourcesTrackerImpl.java:420)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:758)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:672)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:614)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>         at java.lang.Thread.run(Thread.java:745)
> 2014-12-29 13:43:58,701 INFO 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: 
> Initializing user hadoop
> 2014-12-29 13:43:58,702 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Exiting, bbye..
> 2014-12-29 13:43:58,704 INFO org.apache.hadoop.mapred.ShuffleHandler: Setting 
> connection close header...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
