[
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168104#comment-15168104
]
Li Lu commented on YARN-4700:
-----------------------------
[~sjlee0] I checked the flowrun activity table, and the row keys for the same
flow look like this:
{code}
Current count: 1, row:
yarn_cluster!\x7F\xFF\xFE\xAC\xEFl\xBB\xFF!llu!flow_1445894691726_1
...
Current count: 10, row:
yarn_cluster!\x7F\xFF\xFE\xAD7\x85\xC3\xFF!llu!flow_1445894691726_1
...
Current count: 19, row:
yarn_cluster!\x7F\xFF\xFE\xAD<\xAC\x1F\xFF!llu!flow_1445894691726_1
...
{code}
Looking at FlowActivityTable, I can see that the row key contains
inv_top_of_day, which is the reason we have duplicated flows. However, I ran
each of these flows only once and never touched them again. By design, this
should not generate any new flow activities, should it? Is this related to the
RM restart?
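
As a rough check of the inv_top_of_day reading, the sketch below decodes the 8 bytes between the first two "!" separators of each row key shown above. It assumes (this is not confirmed in this thread) that the field is a big-endian signed 64-bit value equal to Long.MAX_VALUE minus the top-of-day timestamp in milliseconds. The helper name decode_inverted_day is hypothetical and used only for illustration.

```python
import struct
from datetime import datetime, timezone

LONG_MAX = 2**63 - 1  # Java Long.MAX_VALUE


def decode_inverted_day(raw: bytes) -> datetime:
    """Decode an 8-byte inverted timestamp into a UTC datetime.

    Assumption: raw is a big-endian signed long holding
    Long.MAX_VALUE - <top-of-day timestamp in milliseconds>.
    """
    (inverted,) = struct.unpack(">q", raw)
    millis = LONG_MAX - inverted
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


# The timestamp bytes copied from the three row keys above, keyed by
# the "Current count" value printed next to each row.
keys = {
    1: b"\x7F\xFF\xFE\xAC\xEFl\xBB\xFF",
    10: b"\x7F\xFF\xFE\xAD7\x85\xC3\xFF",
    19: b"\x7F\xFF\xFE\xAD<\xAC\x1F\xFF",
}

for count, raw in keys.items():
    print(count, decode_inverted_day(raw).isoformat())
```

Under that assumption, the three keys decode to midnight UTC on three different days (2016-02-24, 2016-02-10 and 2016-02-09), so the same flow has one activity row per day, with the most recent day sorting first.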
> ATS storage has one extra record each time the RM got restarted
> ---------------------------------------------------------------
>
> Key: YARN-4700
> URL: https://issues.apache.org/jira/browse/YARN-4700
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Li Lu
> Assignee: Naganarasimha G R
> Labels: yarn-2928-1st-milestone
>
> When testing the new web UI for ATS v2, I noticed that we're creating one
> extra record for each finished application (that is still held in the RM
> state store) each time the RM is restarted. It's quite possible that we add
> the cluster start timestamp to the default cluster id, so each restart
> creates a new record for the same application (the cluster id is part of the
> row key). We need to fix this behavior, probably by having a better default
> cluster id.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)