[ https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Li Lu updated YARN-3134:
------------------------
    Attachment: YARN-3134-042715.patch

In this patch I addressed most of the review comments. Major changes include:
# Storing relationships between entities
# Addressing per-thread connection caching and close problems by using a Guava loading cache
# Refactoring collectors to allow writer injection upon creation
# Disabling autocommit explicitly
# Using StringBuilders to replace StringBuffers

The only two points I have not addressed are:
# Making {{CONN_STRING}} configurable: I still think this is a little early for the POC stage. Adding this support would be quick, though.
# We need to redesign the way metrics are stored for Phoenix. Explicitly using the HBase timestamp (PHOENIX-914) is not yet available in trunk. Meanwhile, YARN-3551 is changing the object model and removing startTime and endTime, which are part of the primary key for Phoenix. I've marked YARN-3551 as blocking this JIRA and will follow up.

> [Storage implementation] Exploiting the option of using Phoenix to access
> HBase backend
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-3134
>                 URL: https://issues.apache.org/jira/browse/YARN-3134
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Zhijie Shen
>            Assignee: Li Lu
>         Attachments: YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, YARN-3134DataSchema.pdf
>
>
> Quote the introduction on the Phoenix web page:
> {code}
> Apache Phoenix is a relational database layer over HBase delivered as a
> client-embedded JDBC driver targeting low latency queries over HBase data.
> Apache Phoenix takes your SQL query, compiles it into a series of HBase
> scans, and orchestrates the running of those scans to produce regular JDBC
> result sets.
> The table metadata is stored in an HBase table and versioned, such that
> snapshot queries over prior versions will automatically use the correct
> schema. Direct use of the HBase API, along with coprocessors and custom
> filters, results in performance on the order of milliseconds for small
> queries, or seconds for tens of millions of rows.
> {code}
> It may simplify how our implementation reads/writes data from/to HBase, and
> we can easily build indexes and compose complex queries.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
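The change list above mentions fixing per-thread connection caching and close problems with a Guava loading cache, plus disabling autocommit explicitly. Since the patch itself is not shown here, the following is only a JDK-only sketch of the same cache-per-thread-and-close-on-shutdown idea; the class and method names are hypothetical, and the actual patch uses Guava's {{CacheBuilder}}/{{CacheLoader}} with a Phoenix {{java.sql.Connection}} on which {{setAutoCommit(false)}} is called.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: cache one resource (e.g. a Phoenix JDBC connection)
// per thread, and close every cached resource on writer shutdown. The real
// patch achieves this with a Guava LoadingCache and a removal listener that
// closes evicted connections.
class PerThreadCache<T extends AutoCloseable> {
  private final ThreadLocal<T> cache;
  // Track all created resources so closeAll() can release them.
  private final Map<Long, T> open = new ConcurrentHashMap<>();

  PerThreadCache(Supplier<T> factory) {
    cache = ThreadLocal.withInitial(() -> {
      // First access from a thread creates and records its resource;
      // a real connection factory would also call setAutoCommit(false) here.
      T resource = factory.get();
      open.put(Thread.currentThread().getId(), resource);
      return resource;
    });
  }

  // Repeated calls from the same thread return the same cached resource.
  T get() {
    return cache.get();
  }

  // Close all cached resources, e.g. when the timeline writer shuts down.
  void closeAll() throws Exception {
    for (T resource : open.values()) {
      resource.close();
    }
    open.clear();
  }
}
```

In the patch's setting, the factory would call {{DriverManager.getConnection}} with the Phoenix JDBC URL (the {{CONN_STRING}} discussed above) and disable autocommit before handing the connection back, so writes are committed in explicit batches.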