[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575059#comment-14575059 ]
Vrushali C commented on YARN-2928:
----------------------------------

Hi [~jamestaylor],

Thank you for taking the time to look through the write-up and for filing PHOENIX-2028.

In the context of pre-splits, yes, we wanted both writers to write to tables that were pre-split with the same pre-split strategy. However, I believe the folks working on the Phoenix writer mentioned that the only way to achieve that in Phoenix was to use the SPLIT ON substatement, which meant that approach had to rewrite the HBase pre-splitting strategy. Perhaps [~gtCarrera9] might be able to speak to this better.

bq. I'd encourage you to use the Phoenix serialization format (through PDataType and derived classes) to ensure you can do adhoc querying on the data

Okay, thanks, I will check that out. We are working on a whole set of enhancements for the base writer as well, and I will look at this.

bq. The most important aspect is how your row key is written and the separators you use if you're storing multiple values in the row key.

You’ve hit the nail on the head. We do have multiple values with different datatypes in the row key, as well as in column names with and without prefixes, so we have different datatypes and a bunch of separators. [~jrottinghuis] has been addressing these points in YARN-3706, e.g. dealing with storing and parsing byte representations of separators.

The timeline service schema has more tables, and we are considering storing aggregated values in these Phoenix-based tables (current thinking is to have them populated via co-processors watching the basic entity table). Thanks for suggesting defining views on Phoenix tables; I will look up more details on that.
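To make the separator concern in the row key concrete, here is a minimal Java sketch. It is not the YARN-3706 code and not the actual timeline service schema: the class name, the `!` separator, the percent-style escaping, and the (cluster, user, timestamp) layout are all assumptions chosen for illustration. It shows why string components need the separator escaped, and why a fixed-width binary component has to be read by position rather than by splitting.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical row key layout: escape(cluster) ! escape(user) ! <8-byte long>
final class RowKeySketch {
    static final String SEP = "!";          // assumed separator
    static final String ESC_SEP = "%21";    // assumed escape for the separator
    static final String ESC_PCT = "%25";    // assumed escape for '%' itself

    static final class Decoded {
        final String cluster;
        final String user;
        final long timestamp;
        Decoded(String cluster, String user, long timestamp) {
            this.cluster = cluster;
            this.user = user;
            this.timestamp = timestamp;
        }
    }

    // Escape '%' first so that unescaping is unambiguous.
    static String escape(String s) {
        return s.replace("%", ESC_PCT).replace(SEP, ESC_SEP);
    }

    static String unescape(String s) {
        return s.replace(ESC_SEP, SEP).replace(ESC_PCT, "%");
    }

    static byte[] encode(String cluster, String user, long timestamp) {
        byte[] prefix = (escape(cluster) + SEP + escape(user) + SEP)
            .getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(prefix.length + Long.BYTES)
            .put(prefix)
            .putLong(timestamp)   // raw bytes may contain the separator byte
            .array();
    }

    static Decoded decode(byte[] key) {
        // Read the fixed-width tail by position; never split on it.
        int tail = key.length - Long.BYTES;
        long timestamp = ByteBuffer.wrap(key, tail, Long.BYTES).getLong();
        String prefix = new String(key, 0, tail, StandardCharsets.UTF_8);
        String[] parts = prefix.split(Pattern.quote(SEP), -1);
        return new Decoded(unescape(parts[0]), unescape(parts[1]), timestamp);
    }
}
```

For example, `RowKeySketch.decode(RowKeySketch.encode("cluster!1", "user%a", 33L))` round-trips all three values even though the string components contain `!` and `%`, and the low byte of the long 33 is the same byte as `!`.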
Thanks once again,
Vrushali

> YARN Timeline Service: Next generation
> --------------------------------------
>
>                 Key: YARN-2928
>                 URL: https://issues.apache.org/jira/browse/YARN-2928
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>            Priority: Critical
>         Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal
> v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx,
> TimelineServiceStoragePerformanceTestSummaryYARN-2928.pdf
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and
> YARN-321. Although it is a great feature, we have recognized several critical
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those.
> This is phase 1 of this effort.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)