[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

Joep Rottinghuis (JIRA) Thu, 21 May 2015 09:29:02 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554606#comment-14554606
 ]


Joep Rottinghuis commented on YARN-3411:
----------------------------------------

On the discussion about auto-creating tables: I strongly agree with Junping 
that schema creation should be a separate and discrete step.
Creating the schema should be as simple as possible. Creating tables on the 
fly when they don't exist might seem like a good way to avoid initial 
friction, but it will lead to very surprising results if a user ever has a 
problem with the classpath setup and/or the configurations rolled to a 
cluster.
We operate a dozen or so production, ad-hoc and test clusters, some dedicated 
to HBase only, others without HBase. The odds of passing a wrong config, or 
getting configs mixed up, are significant. With auto-creation one could simply 
connect to the wrong HBase instance, and data would then start flowing to the 
wrong cluster. I think it'd be better to see explicit failures in that case.

The error message should probably be crystal clear and read something like: ATS 
(class so-and-so) is trying to write to HBase cluster (so-and-so) and is 
missing required table (so-and-so).
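
A minimal sketch of that fail-fast check, in plain Java with no HBase 
dependency. All names here (SchemaCheck, MissingTableException, verifyTables) 
are hypothetical and not taken from any patch on this issue. The set of 
existing tables is passed in; a real writer would presumably populate it from 
the HBase Admin API before accepting any writes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper: refuse to start instead of creating tables on the fly.
final class SchemaCheck {

    // Hypothetical exception type, used only for this illustration.
    public static class MissingTableException extends RuntimeException {
        public MissingTableException(String message) {
            super(message);
        }
    }

    // Throws if any required table is absent from the cluster's table set.
    // writerClass and cluster are used only to build the error message.
    public static void verifyTables(String writerClass, String cluster,
                                    Set<String> existingTables,
                                    List<String> requiredTables) {
        List<String> missing = new ArrayList<>();
        for (String table : requiredTables) {
            if (!existingTables.contains(table)) {
                missing.add(table);
            }
        }
        if (!missing.isEmpty()) {
            throw new MissingTableException(String.format(
                "ATS (%s) is trying to write to HBase cluster (%s) and is "
                    + "missing required table(s): %s",
                writerClass, cluster, String.join(", ", missing)));
        }
    }
}
```

Calling verifyTables once at writer startup would turn a wrong config into an 
immediate, explicit failure rather than data silently landing elsewhere.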

In an earlier comment I suggested having a config key that sets a prefix for 
all table names, so that people can easily switch all tables from one schema 
(test, experimentation) to another (staging, prod).
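
The prefix idea could look roughly like the sketch below. The config key 
"example.timeline.table.prefix", the empty default, and the TableNames class 
are assumptions made up for illustration; a plain Map stands in for the 
cluster configuration object.

```java
import java.util.Map;

// Hypothetical helper: derive every table name from one configurable prefix.
final class TableNames {

    // Assumed key name, for illustration only.
    static final String PREFIX_KEY = "example.timeline.table.prefix";

    // Assumed default: no prefix when the key is not set.
    static final String DEFAULT_PREFIX = "";

    // Returns the configured prefix followed by the base table name.
    public static String resolve(Map<String, String> conf, String baseName) {
        String prefix = conf.getOrDefault(PREFIX_KEY, DEFAULT_PREFIX);
        return prefix + baseName;
    }
}
```

With the key set to "test." the base name "entity" resolves to "test.entity"; 
changing that one value to "prod." moves every table to the prod schema.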

> [Storage implementation] explore the native HBase write schema for storage
> --------------------------------------------------------------------------
>
>                 Key: YARN-3411
>                 URL: https://issues.apache.org/jira/browse/YARN-3411
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Vrushali C
>            Priority: Critical
>         Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
> YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
> YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
> YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
> YARN-3411-YARN-2928.007.patch, YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, 
> YARN-3411.poc.4.txt, YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, 
> YARN-3411.poc.7.txt, YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
