[ https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530885#comment-14530885 ]

Junping Du commented on YARN-3134:
----------------------------------

bq. I have not made the change on the Phoenix connection string since, 
according to our previous discussion, we're planning to address this after 
we've decided which implementation to pursue in the future. 
It should be fine to keep the hard-coded JDBC connection address here. At the 
very least, we should update the name and make it private instead of default 
(package-level) visibility, since nothing outside the class needs to use it 
for now.
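For illustration, a minimal sketch of what that could look like (the constant 
name and the ZooKeeper quorum in the value are placeholders, not necessarily 
what the patch uses):

{code}
  /** Default Phoenix JDBC connection string; hard-coded for now. */
  private static final String DEFAULT_CONNECTION_STRING =
      "jdbc:phoenix:localhost:2181";
{code}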

Some other comments on the latest patch:

{code}
+  /** Default Phoenix JDBC driver name */
+  static final String DRIVER_CLASS_NAME =
+      "org.apache.phoenix.jdbc.PhoenixDriver";
+  /** Default Phoenix timeline entity table name */
+  static final String ENTITY_TABLE_NAME = "TIMELINE_ENTITY";
...
{code}
Are we going to share these constants with HBaseTimelineWriterImpl? If not, we 
should mark them as private. In fact, from a quick check of the patch in 
YARN-3411, I would expect we could reuse some of these constants, but that is 
not something we need to worry about now.
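Until they are shared, the more defensive visibility should do; a sketch of 
the same constants, just marked private:

{code}
  /** Default Phoenix JDBC driver name */
  private static final String DRIVER_CLASS_NAME =
      "org.apache.phoenix.jdbc.PhoenixDriver";
  /** Default Phoenix timeline entity table name */
  private static final String ENTITY_TABLE_NAME = "TIMELINE_ENTITY";
{code}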

{code}
+  private static final String[] PHOENIX_STORAGE_PK_LIST
+      = {"cluster", "user", "flow", "version", "run", "appid", "type",
+         "entityid"};
+  private static final String[] TIMELINE_EVENT_EXTRA_PK_LIST =
+      {"timestamp", "eventid"};
+  private static final String[] TIMELINE_METRIC_EXTRA_PK_LIST =
+      {"metricid"};
{code}
For the key names, we should stay consistent with other places, i.e. replace 
these with cluster_id, user_id, flow_name, flow_version, etc., or we will have 
significant maintenance headaches in the future. In addition, it sounds like 
we don't follow any naming convention for now, e.g. lower/upper CamelCase or 
anything else. Judging from examples in the Phoenix project 
(http://phoenix.apache.org/views.html or 
http://phoenix.apache.org/multi-tenancy.html), they use an underscore as the 
word separator. I would suggest conforming to the same style here.
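For example, the primary key list could become something like the following 
(the first four names are the ones mentioned above; the mappings for run, 
appid, type and entityid are just my suggestions):

{code}
  private static final String[] PHOENIX_STORAGE_PK_LIST
      = {"cluster_id", "user_id", "flow_name", "flow_version",
         "flow_run_id", "app_id", "entity_type", "entity_id"};
{code}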

{code}
+    try {
+      conn.commit();
+    } catch (SQLException se) {
+      LOG.error("Failed to close the phoenix connection! "
+          + se.getLocalizedMessage());
{code}
The message is not correct, as the exception does not happen while closing the 
connection, but during the commit.
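Something along these lines would be accurate (the same snippet with only the 
message fixed):

{code}
    try {
      conn.commit();
    } catch (SQLException se) {
      LOG.error("Failed to commit the Phoenix transaction! "
          + se.getLocalizedMessage());
    }
{code}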

Last but not least, it looks like we have many helper methods 
(appendColumnsSQL, setStringsForColumnFamily, setStringsForPrimaryKey, etc.) 
for constructing the SQL statements. However, these helper methods have no 
comments and lack unit tests to verify their correctness in all cases.
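Even a small JUnit test per helper would help. A minimal sketch, assuming 
appendColumnsSQL appends a comma-separated column list to a StringBuilder (the 
actual signature and visibility in the patch may differ):

{code}
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class TestPhoenixTimelineWriterImpl {
  @Test
  public void testAppendColumnsSQL() {
    // Hypothetical signature: appendColumnsSQL(StringBuilder, String[]).
    StringBuilder sql = new StringBuilder("SELECT ");
    PhoenixTimelineWriterImpl.appendColumnsSQL(
        sql, new String[] {"cluster_id", "user_id", "flow_name"});
    assertEquals("SELECT cluster_id, user_id, flow_name", sql.toString());
  }
}
{code}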

> [Storage implementation] Exploiting the option of using Phoenix to access 
> HBase backend
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-3134
>                 URL: https://issues.apache.org/jira/browse/YARN-3134
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Zhijie Shen
>            Assignee: Li Lu
>              Labels: BB2015-05-TBR
>         Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, 
> YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
> YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, 
> YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, 
> YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, 
> YARN-3134DataSchema.pdf
>
>
> Quote the introduction on Phoenix web page:
> {code}
> Apache Phoenix is a relational database layer over HBase delivered as a 
> client-embedded JDBC driver targeting low latency queries over HBase data. 
> Apache Phoenix takes your SQL query, compiles it into a series of HBase 
> scans, and orchestrates the running of those scans to produce regular JDBC 
> result sets. The table metadata is stored in an HBase table and versioned, 
> such that snapshot queries over prior versions will automatically use the 
> correct schema. Direct use of the HBase API, along with coprocessors and 
> custom filters, results in performance on the order of milliseconds for small 
> queries, or seconds for tens of millions of rows.
> {code}
> It may simplify how our implementation reads/writes data from/to HBase, and 
> make it easy to build indexes and compose complex queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
