[
https://issues.apache.org/jira/browse/YARN-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391617#comment-14391617
]
Zhijie Shen commented on YARN-3391:
-----------------------------------
Thanks for your input, Joep!
bq. Therefore it seems to me that adding the app_id to the flow_id by default
does not add any value,
Yeah, I agree that using the app_id doesn't add value, but IMHO it doesn't
cause a problem either. Going back to the aforementioned example: if
Sleep_...1 -- Sleep_...40 use the application name as the flow name, and
Sleep_...41 -- Sleep_...50 are set explicitly to be part of flow XYZ, I'll get
something weird on the web UI, such as "Sleep 40 runs cost 4/5 x". It misleads
users into thinking there are 40 sleep jobs instead of 50.
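To make the counting concrete, here is a rough sketch of the 50-job example. The record fields, the `flow_name` helper and the "flow_" + app_id default are made up for illustration; they are not the actual timeline service API or the patch's naming.

```python
from collections import Counter

def flow_name(app, default):
    """Return the explicit flow name if set, else a default derived from the app.

    `default` is "app_name" or "app_id"; both rules are illustrative assumptions.
    """
    if app["flow"] is not None:
        return app["flow"]
    if default == "app_name":
        return app["name"]
    return "flow_" + app["id"]

# 50 Sleep jobs: the first 40 set no flow, the last 10 are put in flow XYZ.
apps = [
    {"id": f"application_{i:04d}", "name": "Sleep",
     "flow": None if i <= 40 else "XYZ"}
    for i in range(1, 51)
]

# Defaulting to the application name folds the 40 unassigned jobs into one
# "Sleep" flow, so the UI would report 40 runs although 50 Sleep jobs ran.
by_name = Counter(flow_name(a, "app_name") for a in apps)

# Defaulting to something derived from the app_id keeps each unassigned job in
# its own single-run flow, so no aggregate claims to cover all Sleep jobs.
by_id = Counter(flow_name(a, "app_id") for a in apps)
```

With the app-name default, `by_name` holds a "Sleep" flow with 40 runs next to XYZ with 10. With the app_id-based default, `by_id` holds XYZ with 10 runs plus 40 single-run flows.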
bq. Then are we thinking on the future RM UI, would we show 1 line for each:
Instead, showing this information sounds more like aggregating applications
by application name/type. We can aggregate along these dimensions, just as we
aggregate by queue and so on.
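As a minimal sketch of what I mean by aggregating along a dimension, the same group-by could serve name, type and queue. The `aggregate` helper, the record fields and the cost values below are hypothetical, not an existing YARN API or real data.

```python
from collections import defaultdict

def aggregate(apps, dimension):
    """Sum run count and cost per distinct value of one application attribute."""
    totals = defaultdict(lambda: {"runs": 0, "cost": 0.0})
    for app in apps:
        bucket = totals[app[dimension]]
        bucket["runs"] += 1
        bucket["cost"] += app["cost"]
    return dict(totals)

# Made-up application records, for illustration only.
apps = [
    {"name": "Sleep", "type": "MAPREDUCE", "queue": "default", "cost": 1.0},
    {"name": "Sleep", "type": "MAPREDUCE", "queue": "adhoc", "cost": 2.0},
    {"name": "WordCount", "type": "MAPREDUCE", "queue": "default", "cost": 4.0},
]

by_name = aggregate(apps, "name")    # one row per application name
by_queue = aggregate(apps, "queue")  # same function, different dimension
```

The point is only that the per-name view is one more group-by key, independent of how flows are defined.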
> Clearly define flow ID/ flow run / flow version in API and storage
> ------------------------------------------------------------------
>
> Key: YARN-3391
> URL: https://issues.apache.org/jira/browse/YARN-3391
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Zhijie Shen
> Assignee: Zhijie Shen
> Attachments: YARN-3391.1.patch
>
>
> To continue the discussion in YARN-3040, let's figure out the best way to
> describe the flow.
> Some key issues that we need to conclude on:
> - How do we include the flow version in the context so that it gets passed
> into the collector and to the storage eventually?
> - Flow run id should be a number as opposed to a generic string?
> - Default behavior for the flow run id if it is missing (i.e. client did not
> set it)
> - How do we handle flow attributes in case of nested levels of flows?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)