[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575059#comment-14575059
 ] 

Vrushali C commented on YARN-2928:
----------------------------------


Hi [~jamestaylor]

Thank you for taking the time to look through the write-up and for filing
PHOENIX-2028.

In the context of pre-splits, yes, we wanted both writers to write to
tables that were pre-split with the same strategy. However, I believe
the folks working on the Phoenix writer mentioned that the only way to achieve
that in Phoenix was to use the SPLIT ON clause, which required the Phoenix
approach to rewrite the HBase pre-splitting strategy. Perhaps [~gtCarrera9]
might be able to speak to this better.
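
For illustration only, here is a minimal sketch of how a shared list of split
points could be rendered into a Phoenix CREATE TABLE statement with a SPLIT ON
clause. The class name, table name, columns and split points are all
hypothetical and are not taken from either writer; real split points may well
be binary rather than plain strings.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not from either writer): renders a Phoenix
// CREATE TABLE statement whose SPLIT ON clause lists the given split points.
final class PresplitDdl {

    // Split points are treated as plain strings here; single quotes are
    // doubled so each value stays a valid SQL string literal.
    static String buildCreateTable(String table, List<String> splitPoints) {
        String points = splitPoints.stream()
                .map(p -> "'" + p.replace("'", "''") + "'")
                .collect(Collectors.joining(", "));
        // The column layout below is an assumption made for this sketch.
        return "CREATE TABLE " + table
                + " (entity_id VARCHAR PRIMARY KEY, info VARCHAR)"
                + " SPLIT ON (" + points + ")";
    }

    public static void main(String[] args) {
        // The same list could, in principle, also drive the HBase pre-split.
        System.out.println(buildCreateTable("timeline_entity", List.of("g", "n", "t")));
    }
}
```

Keeping the split points in one list is only one way the two writers might
stay aligned; it does not remove the need to express them in both systems.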

bq. I'd encourage you to use the Phoenix serialization format (through 
PDataType and derived classes) to ensure you can do adhoc querying on the data
Okay, thanks, I will check that out. We are working on a whole set of 
enhancements for the base writer as well and I will look at this. 

bq. The most important aspect is how your row key is written and the separators 
you use if you're storing multiple values in the row key.
You’ve hit the nail on the head. We do have multiple values with different
datatypes in the row key as well as in column names, with and without prefixes,
so we have different datatypes and a bunch of separators. [~jrottinghuis] has
been addressing these points in YARN-3706, e.g. dealing with storing and
parsing byte representations of separators.
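
To make the separator concern concrete, here is a small sketch of a composite
row key codec. It is an assumption-laden illustration, not the scheme used in
YARN-3706: it handles only string components, uses "!" as a hypothetical
separator, and escapes separators inside values with percent-style codes so
that a key can be split back unambiguously.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical codec (not the YARN-3706 implementation): joins string
// components into one row key and splits it back without ambiguity.
final class RowKeyCodec {
    private static final String SEP = "!";

    static String encode(List<String> parts) {
        List<String> escaped = new ArrayList<>();
        for (String p : parts) {
            // Escape the escape character first, then the separator.
            escaped.add(p.replace("%", "%25").replace(SEP, "%21"));
        }
        return String.join(SEP, escaped);
    }

    static List<String> decode(String rowKey) {
        List<String> parts = new ArrayList<>();
        // Limit -1 keeps trailing empty components.
        for (String p : rowKey.split(SEP, -1)) {
            // Undo the escaping in the reverse order of encode().
            parts.add(p.replace("%21", SEP).replace("%25", "%"));
        }
        return parts;
    }
}
```

For example, encoding the components "cluster!1" and "flow" yields
"cluster%211!flow", which decodes back to the original two values. Components
of other datatypes would need their own byte encodings on top of this.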

The timeline service schema has more tables, and we are considering storing
aggregated values in these Phoenix-based tables (current thinking is to have
them populated via coprocessors watching the basic entity table). Thanks for
suggesting defining views on Phoenix tables; I will look up more details on
that.

Thanks once again,
Vrushali

> YARN Timeline Service: Next generation
> --------------------------------------
>
>                 Key: YARN-2928
>                 URL: https://issues.apache.org/jira/browse/YARN-2928
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>            Priority: Critical
>         Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx, 
> TimelineServiceStoragePerformanceTestSummaryYARN-2928.pdf
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
