[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

Junping Du (JIRA) Thu, 21 May 2015 06:26:09 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554268#comment-14554268
 ]


Junping Du commented on YARN-3411:
----------------------------------

Beside the major comment above, some minor comments over the latest (007) patch 
- some of them are NITs (optional to address here or a follow up JIRA):
{code}
  public HBaseTimelineWriterImpl(Configuration conf) throws IOException {
    super(conf.get("yarn.application.id",
        HBaseTimelineWriterImpl.class.getName()));
  }
{code}
I don't quite understand this. Do we try to add a new configuration 
"yarn.application.id" here? It doesn't sounds right and 
HBaseTimelineWriterImpl.class.getName() doesn't sounds like a default value for 
this configuration. Am I missing anything?

{code}
+enum EntityColumnDetails {
+  ID(EntityColumnFamily.INFO, "id"),
+  TYPE(EntityColumnFamily.INFO, "type"),
+  CREATED_TIME(EntityColumnFamily.INFO, "created_time"),
+  MODIFIED_TIME(EntityColumnFamily.INFO, "modified_time"),
+  FLOW_VERSION(EntityColumnFamily.INFO, "flow_version"),
+  PREFIX_IS_RELATED_TO(EntityColumnFamily.INFO, "r"),
+  PREFIX_RELATES_TO(EntityColumnFamily.INFO, "s"),
+  PREFIX_EVENTS(EntityColumnFamily.INFO, "e");
{code}
Given all columns here belongs to INFO C_F, we can omit the parameter of 
EntityColumnFamily.INFO. Also, rename EntityColumnDetails to 
EntityInfoColumnFamilyDetails could sound more clear.

{code}
+  private EntityColumnDetails(EntityColumnFamily columnFamily, 
+      String value) {
+    this.columnFamily = columnFamily;
+    this.value = value;
+    this.inBytes = Bytes.toBytes(this.value.toLowerCase());
+  }
{code}
This is a private constructor and all callers are under control to make sure 
value is already lower case. So toLowerCase() is not necessary (the same for 
EntityColumnFamily).

Other looks fine.

> [Storage implementation] explore the native HBase write schema for storage
> --------------------------------------------------------------------------
>
>                 Key: YARN-3411
>                 URL: https://issues.apache.org/jira/browse/YARN-3411
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Vrushali C
>            Priority: Critical
>         Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
> YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
> YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
> YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
> YARN-3411-YARN-2928.007.patch, YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, 
> YARN-3411.poc.4.txt, YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, 
> YARN-3411.poc.7.txt, YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

Reply via email to