[ https://issues.apache.org/jira/browse/YARN-5095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sangjin Lee updated YARN-5095:
------------------------------
Comment: was deleted
(was: I'm not sure if this is related but I also see this log in the RM log:
{noformat}
2016-05-16 14:19:29,930 ERROR org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollector: Error aggregating timeline metrics
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.timelineservice.storage.common.Separator.joinEncoded(Separator.java:249)
    at org.apache.hadoop.yarn.server.timelineservice.storage.application.ApplicationRowKey.getRowKey(ApplicationRowKey.java:110)
    at org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl.write(HBaseTimelineWriterImpl.java:131)
    at org.apache.hadoop.yarn.server.timelineservice.collector.AppLevelTimelineCollector$AppLevelAggregator.run(AppLevelTimelineCollector.java:136)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
{noformat}
It's quite possible this is a separate issue.)
> flow activities and flow runs are populated with wrong timestamp when RM restarts w/ recovery enabled
> -----------------------------------------------------------------------------------------------------
>
> Key: YARN-5095
> URL: https://issues.apache.org/jira/browse/YARN-5095
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Sangjin Lee
> Priority: Critical
> Labels: yarn-2928-1st-milestone
>
> I have RM recovery enabled. Upon restart, the RM populates records into
> flow activity and flow runs, but with *wrong* timestamps. By "timestamp" I
> mean the timestamp component of the row key:
> - flow activity: the row is created with the day of the RM restart
> - flow run: the row is created with the RM start time as the "run id"
> The following illustrates an example flow run:
> {noformat}
> metrics: [ ],
> events: [ ],
> id: "sjlee@Sleep job/1463433569917",
> type: "YARN_FLOW_RUN",
> createdtime: 1463422860987,
> info: {
>   UID: "yarn_cluster!sjlee!Sleep job!1463433569917",
>   SYSTEM_INFO_FLOW_RUN_ID: 1463433569917,
>   SYSTEM_INFO_FLOW_NAME: "Sleep job",
>   SYSTEM_INFO_FLOW_RUN_END_TIME: 1463422865033,
>   SYSTEM_INFO_USER: "sjlee"
> },
> isrelatedto: { },
> relatesto: { }
> {noformat}
> The created time and the end time are correct (i.e. original time), whereas
> the timestamp in the row key (= run id: 1463433569917) is actually later than
> the end time and coincides with the RM restart.
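
The mismatch described above can be checked directly from the three values in the example. The following is only a minimal sketch: the class name RunIdCheck and its helper methods are hypothetical and are not part of the timeline service code. It assumes all three values are milliseconds since the epoch, as the example suggests.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for illustration only; not part of Hadoop.
public class RunIdCheck {

    /** Returns true when the run id (ms since epoch) is later than the flow run's end time. */
    public static boolean runIdAfterEnd(long runId, long endTime) {
        return runId > endTime;
    }

    /** Returns how far the run id lies past the end time. */
    public static Duration gap(long runId, long endTime) {
        return Duration.ofMillis(runId - endTime);
    }

    public static void main(String[] args) {
        // Values copied from the example flow run in the issue description.
        long runId = 1463433569917L;   // timestamp in the row key (run id)
        long created = 1463422860987L; // createdtime
        long end = 1463422865033L;     // SYSTEM_INFO_FLOW_RUN_END_TIME

        System.out.println("created: " + Instant.ofEpochMilli(created));
        System.out.println("end:     " + Instant.ofEpochMilli(end));
        System.out.println("run id:  " + Instant.ofEpochMilli(runId));
        System.out.println("run id after end time? " + runIdAfterEnd(runId, end));
        System.out.println("gap: " + gap(runId, end));
    }
}
```

With these values the run id is 10,704,884 ms (roughly 2 hours 58 minutes) past the end time, which should never happen if the run id reflected the original start of the flow run.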
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)