[ https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540220#comment-14540220 ]
Junping Du commented on YARN-3411:
----------------------------------

Some further comments on the latest (v5) patch:

{code}
+public class CreateSchema {
{code}
Can we rename it to something more concrete, such as TimelineSchemaCreator?

{code}
+  private static int createTimelineEntityTable() {
+    try {
+      Configuration config = HBaseConfiguration.create();
+      // add the hbase configuration details from classpath
+      config.addResource("hbase-site.xml");
+      Connection conn = ConnectionFactory.createConnection(config);
+      Admin admin = conn.getAdmin();
...
{code}
All of this code should be reusable when creating other tables. Maybe we should move it out of createTimelineEntityTable() and make it a static part of the class?

{code}
+      if (admin.tableExists(table)) {
+        // do not disable / delete existing table
+        // similar to the approach taken by map-reduce jobs when
+        // output directory exists
+        LOG.error("Table " + table.getNameAsString() + " already exists.");
+        return 1;
+      }
{code}
Should we throw an exception here, so the user is notified of the failure reason immediately?

{code}
+      // TTL is 30 days, need to make it configurable perhaps
+      cf3.setTimeToLive(2592000);
{code}
We shouldn't have a hard-coded value here. At the least, add a "TODO" to the comment so it gets fixed later.

In HBaseTimelineWriterImpl.java,
{code}
+    // TODO right now using a default table name
+    // change later to use a config driven table name
+    entityTableName = TableName
+        .valueOf(EntityTableDetails.DEFAULT_ENTITY_TABLE_NAME);
{code}
Until this is configurable, should we be consistent with the table name used by the Phoenix writer? Or is the difference intentional for some reason?

{code}
+    if (entityPuts.size() > 0) {
+      LOG.info("Storing " + entityPuts.size() + " to "
+          + this.entityTableName.getNameAsString());
+      entityTable.put(entityPuts);
+    } else {
+      LOG.warn("empty entity object?");
+    }
{code}
The first log statement should be at DEBUG level and wrapped in an if block on "LOG.isDebugEnabled()", which helps performance.
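To make three of the suggestions above concrete (a configurable TTL, failing loudly when the table already exists, and guarding the debug log), here is a rough plain-JDK sketch. It is only an illustration, not the patch: it uses no HBase classes, java.util.logging stands in for the project's logger, a Set stands in for the tables the cluster already has, and the class name TimelineSchemaSketch and the property key are hypothetical.

```java
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Plain-JDK stand-in for the reviewed code; names here are hypothetical. */
final class TimelineSchemaSketch {
  private static final Logger LOG =
      Logger.getLogger(TimelineSchemaSketch.class.getName());

  // Hypothetical property name, not taken from the patch.
  static final String TTL_KEY = "timeline.entity.table.ttl.seconds";
  // 30 days in seconds, the same value the patch hard-codes (2592000).
  static final int DEFAULT_TTL_SECONDS = 30 * 24 * 60 * 60;

  // Stands in for the tables that already exist in the cluster.
  private final Set<String> existingTables = new HashSet<>();
  private final Properties conf;

  TimelineSchemaSketch(Properties conf) {
    this.conf = conf;
  }

  /** Reads the TTL from configuration, falling back to the 30-day default. */
  int ttlSeconds() {
    String value =
        conf.getProperty(TTL_KEY, String.valueOf(DEFAULT_TTL_SECONDS));
    return Integer.parseInt(value);
  }

  /** Throws instead of returning an error code when the table exists. */
  void createTable(String name) {
    if (!existingTables.add(name)) {
      throw new IllegalStateException("Table " + name + " already exists.");
    }
  }

  /** Builds the message string only when debug-level logging is enabled. */
  void logStore(int count, String table) {
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine("Storing " + count + " to " + table);
    }
  }
}
```

In the real patch the same shape would presumably apply to cf3.setTimeToLive(...), the admin.tableExists(table) branch, and the LOG.info call before entityTable.put(entityPuts).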

> [Storage implementation] explore the native HBase write schema for storage
> --------------------------------------------------------------------------
>
>                 Key: YARN-3411
>                 URL: https://issues.apache.org/jira/browse/YARN-3411
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Vrushali C
>            Priority: Critical
>         Attachments: ATSv2BackendHBaseSchemaproposal.pdf,
> YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt,
> YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native
> HBase schema for the write path. Such a schema does not exclude using
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in
> terms of performance, scalability, usability, etc. and make a call.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)