Vrushali C commented on YARN-3134:

Thanks [~gtCarrera9]! 

bq. About posting metrics, I was thinking if it's possible to allow users just 
send the delta to storage, and we can use some information in the timeline 
entity to infer if the entity itself is already in the entity table? If that's 
possible then we can have some shortcut (not touching entity table) for faster 
metrics updating, which may generate the majority of our storage traffic.

Yes, we do need a way to update a single metric for an entity (regardless of 
other implementation aspects, such as which table is used or whether it's 
native HBase/Phoenix). We had this as part of the initial proposal for the 
TimelineWriter interface in YARN-3031, but per review suggestions there, we 
decided to add it later. I think we do need a writer interface method for 
writing/updating a single metric.
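For illustration, a single-metric update method on the writer might look like the sketch below. The class and method names are hypothetical (this is not the actual YARN-3031 interface), and an in-memory map stands in for the HBase/Phoenix backend:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a single-metric update API; names are
// illustrative, not the actual TimelineWriter interface from YARN-3031.
public class SingleMetricWriterSketch {
    // entityId -> (metricName -> latest value); stands in for the backend
    private final Map<String, Map<String, Long>> store = new HashMap<>();

    /** Update one metric for one entity without rewriting the whole entity. */
    public void writeMetric(String entityId, String metricName, long value) {
        store.computeIfAbsent(entityId, k -> new HashMap<>())
             .put(metricName, value);
    }

    public Long readMetric(String entityId, String metricName) {
        Map<String, Long> metrics = store.get(entityId);
        return metrics == null ? null : metrics.get(metricName);
    }
}
```

The point of the shortcut discussed above is that such a call could skip the entity table entirely when the entity row is known to exist.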

Also [~gtCarrera9], I think this patch does not yet include entity events? I 
recall a discussion that we should add only certain lifecycle events rather 
than all of them, but in any case the TimelineWriter implementation does need 
to write events to the backend. I think we may want an API that writes a 
single event to the backend as well. 
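Similarly, a single-event write could be sketched as below. Again, all names are hypothetical and a simple map stands in for the backend; the real API would take a timeline entity/event type:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a single-event write API; names are
// illustrative and not the actual TimelineWriter interface.
public class SingleEventWriterSketch {
    /** Minimal lifecycle event: an id plus a timestamp. */
    public static final class Event {
        public final String id;
        public final long timestamp;
        public Event(String id, long timestamp) {
            this.id = id;
            this.timestamp = timestamp;
        }
    }

    // entityId -> ordered list of events written for that entity
    private final Map<String, List<Event>> backend = new HashMap<>();

    /** Append one event for one entity without rewriting the whole entity. */
    public void writeEvent(String entityId, Event event) {
        backend.computeIfAbsent(entityId, k -> new ArrayList<>()).add(event);
    }

    public List<Event> getEvents(String entityId) {
        return backend.getOrDefault(entityId, new ArrayList<>());
    }
}
```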

> [Storage implementation] Exploiting the option of using Phoenix to access 
> HBase backend
> ---------------------------------------------------------------------------------------
>                 Key: YARN-3134
>                 URL: https://issues.apache.org/jira/browse/YARN-3134
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Zhijie Shen
>            Assignee: Li Lu
>         Attachments: YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
> YARN-3134DataSchema.pdf
> Quote the introduction on Phoenix web page:
> {code}
> Apache Phoenix is a relational database layer over HBase delivered as a 
> client-embedded JDBC driver targeting low latency queries over HBase data. 
> Apache Phoenix takes your SQL query, compiles it into a series of HBase 
> scans, and orchestrates the running of those scans to produce regular JDBC 
> result sets. The table metadata is stored in an HBase table and versioned, 
> such that snapshot queries over prior versions will automatically use the 
> correct schema. Direct use of the HBase API, along with coprocessors and 
> custom filters, results in performance on the order of milliseconds for small 
> queries, or seconds for tens of millions of rows.
> {code}
> It may simplify our implementation's reads/writes from/to HBase, and make it 
> easy to build indexes and compose complex queries.

This message was sent by Atlassian JIRA