[jira] [Commented] (YARN-5277) when localizers fail due to resource timestamps being out, provide more diagnostics

Steve Loughran (JIRA) Tue, 21 Jun 2016 02:16:33 -0700

    [ https://issues.apache.org/jira/browse/YARN-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15341420#comment-15341420 ]


Steve Loughran commented on YARN-5277:
--------------------------------------

Existing stack trace below.

I'm happy to provide a patch for this, provided I get a commitment from someone 
in the YARN team to actually review my patch. If not, I'm not going to bother.

{code}
java.io.IOException: Resource hdfs://clusterfs:8020/user/hrt_qa/.sparkStaging/application_1466445165023_0013/spark-assembly-1.6.1.jar changed on src filesystem (expected 1466447774453, was 1466447776952
        at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:255)
        at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

2016-06-20 18:36:29,988 INFO  container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e04_1466445165023_0013_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
{code}
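
The two long values in the exception above are epoch milliseconds, which is why they are hard to read at a glance. As a rough sketch of the kind of message this issue is asking for (not the actual FSDownload code, and not a proposed patch), the helper below prints both timestamps in readable form, the difference between them, and the current wall time. The class name TimestampDiagnostics and the method describeMismatch are hypothetical, and the use of java.time assumes a Java 8 runtime.

```java
import java.time.Instant;

public class TimestampDiagnostics {

    // Hypothetical helper: builds a diagnostic string from the expected and
    // actual modification times (epoch millis) plus the current wall time.
    static String describeMismatch(String resource, long expectedMillis,
                                   long actualMillis, long nowMillis) {
        long difference = actualMillis - expectedMillis;
        return String.format(
            "Resource %s changed on src filesystem: expected %d (%s), was %d (%s); "
                + "difference %d ms; current wall time %d (%s)",
            resource,
            expectedMillis, Instant.ofEpochMilli(expectedMillis),
            actualMillis, Instant.ofEpochMilli(actualMillis),
            difference,
            nowMillis, Instant.ofEpochMilli(nowMillis));
    }

    public static void main(String[] args) {
        // Values taken from the stack trace above; wall time is read locally.
        System.out.println(describeMismatch(
            "hdfs://clusterfs:8020/user/hrt_qa/.sparkStaging/"
                + "application_1466445165023_0013/spark-assembly-1.6.1.jar",
            1466447774453L,
            1466447776952L,
            System.currentTimeMillis()));
    }
}
```

For the values in this trace, the expected timestamp is 2016-06-20T18:36:14.453Z and the actual one is 2016-06-20T18:36:16.952Z, a difference of 2499 ms.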

> when localizers fail due to resource timestamps being out, provide more 
> diagnostics
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-5277
>                 URL: https://issues.apache.org/jira/browse/YARN-5277
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>
> When an NM fails a resource download because the timestamps do not match, 
> there's not much info, just two long values. 
> It would be good to also include the local time values, *and the current wall 
> time*. These are the things people need to know when trying to work out what 
> went wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

