[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168104#comment-15168104
 ] 

Li Lu commented on YARN-4700:
-----------------------------

[~sjlee0] I checked the flowrun activity table and I can see the row keys for 
the same flow is like this:
{code}
Current count: 1, row: 
yarn_cluster!\x7F\xFF\xFE\xAC\xEFl\xBB\xFF!llu!flow_1445894691726_1
...
Current count: 10, row: 
yarn_cluster!\x7F\xFF\xFE\xAD7\x85\xC3\xFF!llu!flow_1445894691726_1
...
Current count: 19, row: 
yarn_cluster!\x7F\xFF\xFE\xAD<\xAC\x1F\xFF!llu!flow_1445894691726_1
...
{code}

>From FlowActivityTable, I can see the row key contains inv_top_of_day, which 
>is the reason we have duplicated flows. However, for each of the flows, I only 
>ran it for once, and never touch them again. By design this should not 
>regenerate any new flow activities? Is this related to RM restart? 

> ATS storage has one extra record each time the RM got restarted
> ---------------------------------------------------------------
>
>                 Key: YARN-4700
>                 URL: https://issues.apache.org/jira/browse/YARN-4700
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Li Lu
>            Assignee: Naganarasimha G R
>              Labels: yarn-2928-1st-milestone
>
> When testing the new web UI for ATS v2, I noticed that we're creating one 
> extra record for each finished application (but still hold in the RM state 
> store) each time the RM got restarted. It's quite possible that we add the 
> cluster start timestamp into the default cluster id, thus each time we're 
> creating a new record for one application (cluster id is a part of the row 
> key). We need to fix this behavior, probably by having a better default 
> cluster id. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to