[ 
https://issues.apache.org/jira/browse/YARN-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567573#comment-14567573
 ] 

Junping Du commented on YARN-3699:
----------------------------------

Hi [~jrottinghuis] and [~vrushalic], thanks for your comments, and sorry for 
the late reply; I was traveling last week. 
I fully agree with Joep's comments above: there is no right or wrong schema, 
only one that fits our priority scenarios better:
- if we more often need the flow_run entries under a specific flow (or flows), 
then making the flow version a column will make that query more efficient.
- if we equally (or more) often need the flow_run entries under specific flow 
version(s), then our decision here could be different.
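
To make the two cases above concrete, here is a minimal sketch in plain Python (not actual HBase or Phoenix code). It models a table as a list sorted by row key and compares the two layouts. The "!" separator, the key field order (cluster, user, flow, optional version, run id) and the sample data are all assumptions for illustration only, not the schema from either patch.

```python
from bisect import bisect_left

SEP = "!"  # hypothetical row key separator

# Hypothetical sample data: (cluster, user, flow, flow_version, run_id)
RUNS = [
    ("c1", "alice", "etl", "v1", 1),
    ("c1", "alice", "etl", "v1", 2),
    ("c1", "alice", "etl", "v2", 3),
    ("c1", "alice", "report", "v1", 4),
]


def build_table(runs, version_in_key):
    """Return (row_key, columns) pairs sorted by row key, like a sorted KV store."""
    rows = []
    for cluster, user, flow, version, run_id in runs:
        parts = [cluster, user, flow]
        if version_in_key:
            parts.append(version)
        parts.append(f"{run_id:010d}")  # zero-pad so string order matches numeric order
        rows.append((SEP.join(parts), {"flow_version": version, "run_id": run_id}))
    return sorted(rows, key=lambda row: row[0])


def prefix_scan(table, prefix):
    """Return the contiguous range of rows whose key starts with prefix."""
    keys = [key for key, _ in table]
    start = bisect_left(keys, prefix)
    matched = []
    for key, cols in table[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so the matching range has ended
        matched.append((key, cols))
    return matched


as_column = build_table(RUNS, version_in_key=False)
in_key = build_table(RUNS, version_in_key=True)

# Case 1: all runs of flow "etl". With the version as a column this is one
# prefix scan, and the rows come back ordered by run id across versions.
runs_of_flow = prefix_scan(as_column, "c1!alice!etl!")

# Case 2: runs of "etl" at version v2. With the version as a column we scan
# the whole flow and filter on the column value.
v2_filtered = [row for row in runs_of_flow if row[1]["flow_version"] == "v2"]

# With the version in the row key, the same query is a narrower prefix scan.
v2_scanned = prefix_scan(in_key, "c1!alice!etl!v2!")
```

In this toy model both layouts can answer both queries; the difference is how many rows get read and in what order they come back, which is why the answer depends on which query matters more.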
To me, the tricky/interesting part here is that the boundary between different 
flows and flow versions can be vague in practice: how big or small must a 
change to a flow be before it starts a new flow rather than a new flow version? 
Why would we have several active flow versions instead of only one active flow 
version (and more flows)? These trade-offs in application concepts also affect 
our trade-offs in schema design, which is a pretty common thing that I have 
seen in other apps as well.
I would like to trust your priorities here, given your experience with hRaven, 
which has been running well in production for years. So I agree the Phoenix 
schema should be adjusted slightly to get closer to the HBase one. 
Maybe we should file a new JIRA for this (Phoenix schema) change? We can 
either keep this JIRA open for discussion or resolve it as Later, so that in 
future, if others from the community bring other solid scenarios from practice, 
we can continue the discussion here and try to reach a better trade-off or a 
new design. 
Thoughts?

> Decide if  flow version should be part of row key or column
> -----------------------------------------------------------
>
>                 Key: YARN-3699
>                 URL: https://issues.apache.org/jira/browse/YARN-3699
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Vrushali C
>
> Based on discussions in YARN-3411 with [~djp], filing jira for continuing 
> discussion on putting the flow version in rowkey or column. 
> Either phoenix/hbase approach will update the jira with the conclusions.



