[ https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509951#comment-14509951 ]
Li Lu commented on YARN-3411:
-----------------------------

Hi [~vrushalic], thanks for the patch! I'm OK with the major part of this patch for now. Here are some questions that I think we should discuss.

# About null checks: so far we do not have a fixed standard on if and where we need to do null checks. I noticed you assumed that info, config, event, and other similar fields are not null. Maybe we should explicitly decide when each of those fields can be null or empty.
# Maybe we should change TimelineWriterUtils to the default (package-private) access modifier? I think making it visible within the package would be sufficient.
# One thing I'd like to open a discussion on is how to store and process metrics. Currently, in the HBase patch, startTime and endTime are not used. In the Phoenix patch, I store time series as flattened, non-queryable strings. I think this part also requires some input from the time-based aggregation work.
# Another thing I'd like to discuss here is if and how we should set up a separate "fast path" for metric-only updates. On the storage layer, I'd strongly +1 a separate fast path so that such updates touch only the (frequently updated) metrics table.

Any proposals, everyone?

> [Storage implementation] explore the native HBase write schema for storage
> --------------------------------------------------------------------------
>
>                 Key: YARN-3411
>                 URL: https://issues.apache.org/jira/browse/YARN-3411
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Vrushali C
>            Priority: Critical
>         Attachments: ATSv2BackendHBaseSchemaproposal.pdf,
> YARN-3411.poc.2.txt, YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native
> HBase schema for the write path. Such a schema does not exclude using
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in
> terms of performance, scalability, usability, etc. and make a call.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)