[ https://issues.apache.org/jira/browse/YARN-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594743#comment-16594743 ]
Abhishek Modi commented on YARN-8535:
-------------------------------------
This issue occurs because the Distributed Shell AM is not able to publish
events to the Timeline Service. It started happening after YARN-3562. After
that change, the NM Collector service listens on a random port and records
that port in the conf, but the PerNodeTimelineCollector service is an
auxiliary service and has its own copy of the conf, in which the address of
the NM collector service never gets updated.
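
To illustrate the mechanism described above, here is a minimal sketch that
does not use any Hadoop classes. The class name, the method names, and the
property key are all hypothetical; a plain Map stands in for the conf. It only
shows why a copy taken before the port is chosen keeps the old address.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleConfCopyDemo {
    // Hypothetical key for illustration only; not the real YARN property name.
    static final String COLLECTOR_ADDRESS_KEY = "demo.nm.collector.address";

    // Stands in for an auxiliary service receiving its own copy of the conf.
    static Map<String, String> copyForAuxService(Map<String, String> conf) {
        return new HashMap<>(conf);
    }

    // Stands in for a service that binds to a port picked at startup
    // and writes the resulting address back into the conf it was given.
    static void bindToRandomPort(Map<String, String> conf, int chosenPort) {
        conf.put(COLLECTOR_ADDRESS_KEY, "localhost:" + chosenPort);
    }

    public static void main(String[] args) {
        Map<String, String> nmConf = new HashMap<>();
        // Port 0 is the placeholder that gets replaced once a port is chosen.
        nmConf.put(COLLECTOR_ADDRESS_KEY, "localhost:0");

        // The auxiliary service takes its copy before the port is known.
        Map<String, String> auxConf = copyForAuxService(nmConf);

        // The collector service starts and updates only the original conf.
        bindToRandomPort(nmConf, 45123);

        System.out.println("NM conf:  " + nmConf.get(COLLECTOR_ADDRESS_KEY));
        System.out.println("Aux conf: " + auxConf.get(COLLECTOR_ADDRESS_KEY));
    }
}
```

In this sketch the copy still reports localhost:0 after the original has been
updated, which mirrors the stale address described above. A fix would need to
either share the updated value or pass the address to the auxiliary service
after the port is known; which approach the actual patch takes is not stated
here.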
> DistributedShell unit tests are failing
> ---------------------------------------
>
> Key: YARN-8535
> URL: https://issues.apache.org/jira/browse/YARN-8535
> Project: Hadoop YARN
> Issue Type: Bug
> Components: distributed-shell, timelineservice
> Reporter: Eric Yang
> Assignee: Abhishek Modi
> Priority: Major
>
> These tests have been failing for a while in trunk:
> |[testDSShellWithoutDomainV2|https://builds.apache.org/job/PreCommit-YARN-Build/21243/testReport/org.apache.hadoop.yarn.applications.distributedshell/TestDistributedShell/testDSShellWithoutDomainV2]|1 min 20 sec|Failed|
> |[testDSShellWithoutDomainV2CustomizedFlow|https://builds.apache.org/job/PreCommit-YARN-Build/21243/testReport/org.apache.hadoop.yarn.applications.distributedshell/TestDistributedShell/testDSShellWithoutDomainV2CustomizedFlow]|1 min 20 sec|Failed|
> |[testDSShellWithoutDomainV2DefaultFlow|https://builds.apache.org/job/PreCommit-YARN-Build/21243/testReport/org.apache.hadoop.yarn.applications.distributedshell/TestDistributedShell/testDSShellWithoutDomainV2DefaultFlow]|1 min 20 sec|Failed|
> The root causes are the same:
> {code:java}
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.verifyEntityTypeFileExists(TestDistributedShell.java:628)
> at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.checkTimelineV2(TestDistributedShell.java:546)
> at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:451)
> at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:310)
> at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithoutDomainV2(TestDistributedShell.java:306)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74){code}