[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14577885#comment-14577885 ]
Li Lu commented on YARN-2928:
-----------------------------

Hi [~jamestaylor], thank you very much for your great help! Some clarifications on my questions:

bq. For your configuration/metric key-value pair, how are they named? Do you know the possible set of key values in advance? Or are they known more-or-less on-the-fly?

For our use case they are determined completely on the fly. For each timeline entity, we plan to store each of its configurations/metrics in one dynamic column. Different entities may have completely different configs/metrics; for example, a MapReduce job may have a completely different set of configs from a Tez job. Therefore, we need to generate all columns for configs/metrics dynamically. I'm wondering whether, when adding the dynamic columns into a view, I still need to explicitly declare those dynamic columns (I assume yes, but would like to double-check).

bq. Are you thinking to have a secondary table that's a rollup aggregation of more raw data? Is that required, or is it more of a convenience for the user? If the raw data is Phoenix-queryable, then I think you have a lot of options. Can you point me to some more info on your design?

Yes, we are considering having multiple levels of aggregation tables, each with a different granularity. For example, we are now planning to do the first-level (application-level) aggregation from an HBase table into a Phoenix table. Then we can aggregate flow-level information on top of our application-level aggregation (since each application belongs to one and only one flow). In this way, we can work around the write-throughput limitation of Phoenix for now, while still supporting SQL queries on aggregated data.

If the Phoenix PDataTypes are stable, is it possible for us to do the following two things?
# Use the HBase API and PDataTypes to read a Phoenix table, reading dynamic columns iteratively.
# Use the HBase API and PDataTypes to write a Phoenix table, writing dynamic columns iteratively.
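For reference, the dynamic-column usage discussed above looks roughly like the following in Phoenix SQL. This is a minimal sketch; the table and column names (timeline_entity, gc_time_ms, etc.) are illustrative only, not part of the actual ATSv2 schema. Note that a dynamic column is declared inline with its type at both write time and query time, which is what the view question above is about.

{code:sql}
-- Hypothetical entity table; names are illustrative only.
CREATE TABLE IF NOT EXISTS timeline_entity (
    entity_id VARCHAR NOT NULL,
    created_time BIGINT,
    CONSTRAINT pk PRIMARY KEY (entity_id)
);

-- Write a per-entity metric as a dynamic column, declaring its type inline.
UPSERT INTO timeline_entity (entity_id, created_time, gc_time_ms BIGINT)
VALUES ('app_0001', 1433000000000, 1234);

-- Read it back; dynamic columns must be re-declared at query time.
SELECT entity_id, gc_time_ms
FROM timeline_entity (gc_time_ms BIGINT)
WHERE entity_id = 'app_0001';
{code}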
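To make question (1)/(2) above concrete, here is a sketch of what reading and writing a Phoenix-encoded cell through the raw HBase client API might look like, assuming PDataTypes are stable. All names (TIMELINE_ENTITY, GC_TIME_MS, the row key) are hypothetical, and this assumes Phoenix's default column family "0"; it is not a tested implementation. One caveat: writes that bypass Phoenix also skip Phoenix's internal bookkeeping (e.g. the empty key-value cell it adds to each row), so rows written this way may need extra care to be visible to Phoenix queries.

{code:java}
// Sketch only: assumes Phoenix 4.x and the HBase 1.x client on the classpath,
// and a table created by Phoenix. Table, column, and row-key names are hypothetical.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.phoenix.schema.types.PLong;
import org.apache.phoenix.schema.types.PVarchar;

public class DynamicColumnSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn =
                 ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("TIMELINE_ENTITY"))) {

            byte[] rowKey = PVarchar.INSTANCE.toBytes("app_0001");
            byte[] family = Bytes.toBytes("0");             // Phoenix's default column family
            byte[] qualifier = Bytes.toBytes("GC_TIME_MS"); // one dynamic column per metric

            // Write: encode the value with the same PDataType Phoenix would use.
            Put put = new Put(rowKey);
            put.addColumn(family, qualifier, PLong.INSTANCE.toBytes(1234L));
            table.put(put);

            // Read: decode with the matching PDataType.
            Result result = table.get(new Get(rowKey));
            byte[] raw = result.getValue(family, qualifier);
            Long gcTimeMs = (Long) PLong.INSTANCE.toObject(raw);
            System.out.println("gc_time_ms = " + gcTimeMs);
        }
    }
}
{code}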
> YARN Timeline Service: Next generation
> --------------------------------------
>
>                 Key: YARN-2928
>                 URL: https://issues.apache.org/jira/browse/YARN-2928
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>            Priority: Critical
>         Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx, TimelineServiceStoragePerformanceTestSummaryYARN-2928.pdf
>
> We have the application timeline server implemented in YARN per YARN-1530 and YARN-321. Although it is a great feature, we have recognized several critical issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. This is phase 1 of this effort.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)