[ https://issues.apache.org/jira/browse/YARN-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049005#comment-15049005 ]
Sangjin Lee commented on YARN-3623:
-----------------------------------

Thanks for your comment [~djp]. Per comments above, let's move the v.1-v.2 compatibility discussion to YARN-3196. I just want to clarify my comments to the extent they are relevant to this JIRA, as I think they might have been misunderstood.

{quote}
I would question on this. If "yarn.timeline-service.version" is 2 (after cluster is being upgrade), and we don't serve 1/1.5 ATS service any more, how can existing running applications survival for timeline services? Unless we have a clear answer in v2 that we will continue to maintain a ATS v1/v1.5 service as a legacy daemon in v2 (I don't prefer this way), I don't think we should mark this config to indicate an unique version of ATS service running in the server side.
{quote}

When I said the cluster should bring up that exact version of the timeline service, I didn't mean that we will not support any compatibility. I definitely agree that compatibility and support for a smooth rolling upgrade should be an objective, and that's why we want to continue the discussion and work on YARN-3196. What I meant to do is to separate the compatibility support (or rolling upgrade support) from the main interpretation of this config on the cluster. It is true that the main mode of operation will be on the version that is declared via timeline-service.version.

Also, I think there are many options in the way we can implement the rolling upgrade support. Supporting rolling upgrade does not necessarily mean that the v.1 write/read endpoints must be up in parallel with the v.2 write/read endpoints. We talked about having some kind of a temporary proxy or something in the timeline client itself. There may be other ways, but we're not mandating that the old endpoints must be up to implement the rolling upgrade support.
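To make that separation a bit more concrete, here is a minimal sketch, not actual Hadoop code. Only the property name "yarn.timeline-service.version" comes from this discussion; the class, the enum, the assumed default value, and the rollingUpgradeSupported flag are all hypothetical names introduced for illustration. The sketch treats the configured value as the main mode of operation and makes handling of older clients a separate decision.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; none of these types exist in Hadoop.
class TimelineVersionSketch {
    // Property name as quoted in this thread.
    static final String VERSION_KEY = "yarn.timeline-service.version";
    // Assumed default, chosen purely for illustration.
    static final float ASSUMED_DEFAULT_VERSION = 1.0f;

    // How the cluster might treat a call from a client of a given version.
    enum Handling { NATIVE, COMPAT, REJECT }

    // The "main interpretation" of the config: the version the cluster runs.
    static float clusterVersion(Map<String, String> conf) {
        String raw = conf.get(VERSION_KEY);
        return raw == null ? ASSUMED_DEFAULT_VERSION : Float.parseFloat(raw.trim());
    }

    // Compatibility is decided separately from the declared version.
    // COMPAT says nothing about the mechanism (proxy, client-side shim,
    // or old endpoints); that remains an implementation choice.
    static Handling classify(float clusterVersion, float clientVersion,
                             boolean rollingUpgradeSupported) {
        if (Float.compare(clientVersion, clusterVersion) == 0) {
            return Handling.NATIVE;
        }
        if (clientVersion < clusterVersion && rollingUpgradeSupported) {
            return Handling.COMPAT;
        }
        return Handling.REJECT;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(VERSION_KEY, "2.0");
        float cluster = clusterVersion(conf);
        for (float client : new float[] {1.0f, 1.5f, 2.0f}) {
            System.out.println("client v" + client + " -> "
                + classify(cluster, client, true));
        }
    }
}
```

The point of the sketch is only that the declared version and the compatibility path are independent inputs; which compatibility mechanism to use is left open for YARN-3196.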

My point was that saying timeline-service.version = 2 doesn't *automatically* mean we still must bring up endpoints of the previous version (or versions), as that's more of an implementation choice for how to support rolling upgrade. I hope that clarifies my earlier comments.

{quote}
This works if we don't consider rolling upgrade case. For roll up cases, an running application/framework cannot switch its client version config if YARN cluster is upgrading to a new version ATS. We shouldn't claim that application's clients is expected to be no response if version is mis-match with serve or the user would misunderstand they have to kill these applications after upgrade. Instead, we should claim that client is not supposed to override this config that vary with cluster config unless they are pretty sure what cluster side are doing (like upgrading process, etc.).
{quote}

Again, I hope it is clear that what I meant was NOT that we will not consider the rolling upgrade use case. Even if the cluster is running with version = 2, with proper rolling upgrade support, it should be prepared to handle (during the transition) calls that are coming in from running apps with version = 1 or 1.5. That's why I said "depending on how robust the compatibility story is". Let me know if this helps in any way.

> We should have a config to indicate the Timeline Service version
> ----------------------------------------------------------------
>
>                 Key: YARN-3623
>                 URL: https://issues.apache.org/jira/browse/YARN-3623
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: timelineserver
>            Reporter: Zhijie Shen
>            Assignee: Xuan Gong
>         Attachments: YARN-3623-2015-11-19.1.patch
>
>
> So far RM, MR AM, DA AM added/changed new config to enable the feature to
> write the timeline data to v2 server. It's good to have a YARN
> timeline-service.version config like timeline-service.enable to indicate the
> version of the running timeline service with the given YARN cluster. It's
> beneficial for users to more smoothly move from v1 to v2, as they don't need
> to change the existing config, but switch this config from v1 to v2. And each
> framework doesn't need to have their own v1/v2 config.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)