[
https://issues.apache.org/jira/browse/YARN-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rohith Sharma K S updated YARN-6827:
------------------------------------
Description:
While recovering applications, an NPE is thrown, as shown below.
{noformat}
017-07-13 14:08:12,476 ERROR
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher:
Error when publishing entity [YARN_APPLICATION,application_1499929227397_0001]
java.lang.NullPointerException
at
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:178)
at
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.putEntity(TimelineServiceV1Publisher.java:368)
{noformat}
This is because, in the non-HA case, RM service start brings up the active
services first and the ATS services later. In the HA case, the
transitionToActive event arrives before the ATS services are started.
This gives the active services enough time to recover the applications, and
recovery tries to publish into ATS. Since the ATS services are not started
yet, an NPE is thrown.
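The ordering problem described above can be sketched in a few lines of Java.
This is only an illustration under stated assumptions: StartOrderSketch,
SketchClient and GuardedPublisher are hypothetical stand-ins, not the Hadoop
classes, and buffering until start is one possible mitigation, not necessarily
the fix applied for this issue. The sketch is single-threaded and ignores
synchronization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Hadoop classes.
final class StartOrderSketch {

    /** Plays the role of a timeline client: usable only after start(). */
    static final class SketchClient {
        private List<String> sink;            // stays null until start() runs

        void start() {
            sink = new ArrayList<>();
        }

        void putEntity(String entity) {
            sink.add(entity);                 // throws NPE if start() has not run
        }

        int published() {
            return sink == null ? 0 : sink.size();
        }
    }

    /** Publisher that buffers entities until its client has been started. */
    static final class GuardedPublisher {
        private final SketchClient client;
        private final List<String> pending = new ArrayList<>();
        private boolean started;

        GuardedPublisher(SketchClient client) {
            this.client = client;
        }

        void start() {
            client.start();
            started = true;
            for (String entity : pending) {   // flush what recovery queued earlier
                client.putEntity(entity);
            }
            pending.clear();
        }

        void publish(String entity) {
            if (!started) {                   // client not ready: queue, do not call it
                pending.add(entity);
                return;
            }
            client.putEntity(entity);
        }

        int pendingCount() {
            return pending.size();
        }
    }

    /** Returns true if publishing before start() fails with an NPE. */
    static boolean unguardedPublishFails() {
        SketchClient client = new SketchClient();
        try {
            client.putEntity("application_1499929227397_0001");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("unguarded publish fails: " + unguardedPublishFails());

        SketchClient client = new SketchClient();
        GuardedPublisher publisher = new GuardedPublisher(client);
        publisher.publish("application_1499929227397_0001"); // recovery runs first
        publisher.start();                                    // client starts later
        System.out.println("published after start: " + client.published());
    }
}
```

The unguarded call fails for the same reason as in the trace: the publisher
reaches a client that has not been started yet. The other way to avoid this is
to start the client side before anything that can trigger recovery.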
was:
While recovering applications, an NPE is thrown, as shown below.
{noformat}
017-07-13 14:08:12,476 ERROR
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher:
Error when publishing entity [YARN_APPLICATION,application_1499929227397_0001]
java.lang.NullPointerException
at
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:178)
at
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.putEntity(TimelineServiceV1Publisher.java:368)
{noformat}
This is because, in RM service creation, the active services are created first
and the ATS services are created later. This means the active services are
started first and the ATS services are started at a later point in time.
This gives the active services enough time to recover the applications, and
recovery tries to publish into ATS. Since the ATS services are not started
yet, an NPE is thrown.
> [ATS1/1.5] NPE exception while publishing recovering applications into ATS
> during RM restart.
> ---------------------------------------------------------------------------------------------
>
> Key: YARN-6827
> URL: https://issues.apache.org/jira/browse/YARN-6827
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Reporter: Rohith Sharma K S
> Assignee: Rohith Sharma K S
>
> While recovering applications, an NPE is thrown, as shown below.
> {noformat}
> 017-07-13 14:08:12,476 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher:
> Error when publishing entity
> [YARN_APPLICATION,application_1499929227397_0001]
> java.lang.NullPointerException
> at
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:178)
> at
> org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.putEntity(TimelineServiceV1Publisher.java:368)
> {noformat}
> This is because, in the non-HA case, RM service start brings up the active
> services first and the ATS services later. In the HA case, the
> transitionToActive event arrives before the ATS services are started.
> This gives the active services enough time to recover the applications, and
> recovery tries to publish into ATS. Since the ATS services are not started
> yet, an NPE is thrown.