[
https://issues.apache.org/jira/browse/YARN-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16032023#comment-16032023
]
Vrushali C commented on YARN-6323:
----------------------------------
Hmm, I have been thinking this over, and I think we all discussed it a bit in
the last weekly call too.
During upgrade, in any case, there won't be complete information for that flow
since some containers would have already finished, some might be running on
older nodes, some might start on newer ones.
The NM does not have the app name but needs to create a default flow context
upon restart. The only thing that I can see it can use is the app id.
We could add a special case in the writer that drops the data when a
particular flow context is in use. What I mean is, when the NM restarts with
ATSv2 enabled for the first time and does not find an existing flow context,
we create a specific dummy flow context and check for it in the writer. If the
incoming context matches this "drop data" flow context, we simply do not write
the data to the backend.
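To make the idea concrete, here is a rough sketch of the "drop data" check. It
is only an illustration under assumptions: FlowContext, DroppingWriter,
NodeManagerRecovery and the sentinel name "__drop_data__" are hypothetical
stand-ins invented for this example, not the actual YARN classes or the
contents of the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for a flow context; not the real YARN class.
final class FlowContext {
    // Sentinel used when no stored flow context is found (name is an assumption).
    static final FlowContext DROP_DATA = new FlowContext("__drop_data__", "0", 0L);

    final String flowName;
    final String flowVersion;
    final long flowRunId;

    FlowContext(String flowName, String flowVersion, long flowRunId) {
        this.flowName = flowName;
        this.flowVersion = flowVersion;
        this.flowRunId = flowRunId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FlowContext)) {
            return false;
        }
        FlowContext other = (FlowContext) o;
        return flowName.equals(other.flowName)
            && flowVersion.equals(other.flowVersion)
            && flowRunId == other.flowRunId;
    }

    @Override
    public int hashCode() {
        return Objects.hash(flowName, flowVersion, flowRunId);
    }
}

// Hypothetical recovery step: fall back to the sentinel when nothing was stored.
final class NodeManagerRecovery {
    static FlowContext recover(FlowContext stored) {
        return stored != null ? stored : FlowContext.DROP_DATA;
    }
}

// Hypothetical writer that skips the backend for the sentinel context.
final class DroppingWriter {
    // In-memory list standing in for the real storage backend.
    private final List<String> backend = new ArrayList<>();

    // Returns true if the entity was written, false if it was dropped.
    boolean write(FlowContext context, String entityId) {
        if (FlowContext.DROP_DATA.equals(context)) {
            return false; // "drop data" flow context: do not write anything
        }
        backend.add(context.flowName + "/" + entityId);
        return true;
    }

    int writtenCount() {
        return backend.size();
    }
}
```

With this shape, an app recovered without a stored context gets
NodeManagerRecovery.recover(null), and every write for it returns false
without touching the backend, while apps with a real flow context are written
as usual.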
With YARN-6555, work-preserving restart will ensure that the flow context is
written and is therefore available when the NM restarts on later occasions, so
the dummy flow context won't be used in future cases.
> Rolling upgrade/config change is broken on timeline v2.
> --------------------------------------------------------
>
> Key: YARN-6323
> URL: https://issues.apache.org/jira/browse/YARN-6323
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Li Lu
> Assignee: Vrushali C
> Labels: yarn-5355-merge-blocker
> Attachments: YARN-6323.001.patch
>
>
> Found this issue when deploying on real clusters. If there are apps running
> when we enable timeline v2 (with work preserving restart enabled), node
> managers will fail to start due to missing app context data. We should
> probably assign some default names to these "left over" apps. I believe it's
> suboptimal to let users clean up the whole cluster before enabling timeline
> v2.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)