[ 
https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650069#comment-14650069
 ] 

Zhijie Shen commented on YARN-3904:
-----------------------------------

bq. I'm not 100% sure if that's what we would like to do. Maybe we would like 
to decouple the offline aggregation module from our normal entity storage. 
Therefore, maybe it's also appealing to allow users to specify whether they 
need to create the data schema in the offline aggregation process? For 
example, by setting a flag in the offline aggregator to create the data schema?

Makes sense, but can we still keep table creation centralized? I think we can 
add options to create the raw entity tables and the aggregation tables 
separately. Thoughts?
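
To make the centralized-creation idea a bit more concrete, here is a rough 
sketch of one entry point that takes the table groups to create as an option. 
Everything in it is hypothetical: TableGroup, SchemaCreatorSketch, and the 
table names are placeholders for illustration, not existing YARN classes or 
the actual schema.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

// Hypothetical sketch: none of these names exist in YARN.
// The two groups mirror the split suggested above.
enum TableGroup { RAW_ENTITY, AGGREGATION }

final class SchemaCreatorSketch {
    private SchemaCreatorSketch() { }

    // Single, centralized place that decides which tables get created.
    // Returns the (placeholder) table names for the requested groups.
    static List<String> tablesToCreate(EnumSet<TableGroup> groups) {
        List<String> tables = new ArrayList<>();
        if (groups.contains(TableGroup.RAW_ENTITY)) {
            tables.add("entity");            // placeholder name
        }
        if (groups.contains(TableGroup.AGGREGATION)) {
            tables.add("flow_aggregation");  // placeholder name
            tables.add("user_aggregation");  // placeholder name
        }
        return tables;
    }
}
```

With something like this, the offline aggregator could request only 
EnumSet.of(TableGroup.AGGREGATION), while a full setup would pass 
EnumSet.allOf(TableGroup.class), and the creation logic stays in one place.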

bq. After the changes in this JIRA, we will only have two types of 
TimelineWriters, one for FS (test only) and one for HBase. The setting on the 
offline storage should be independent from this setting, I assume?

Yeah, I meant that we currently have TIMELINE_SERVICE_READER|WRITER_CLASS 
pointing to a specific reader/writer implementation. However, it would be 
better to have a config such as "blah.blah.backend.type". When 
backend.type = hbase, the user can access HBase both directly and via Phoenix, 
and we allow aggregation. This may not need to be part of this JIRA; I'm just 
thinking out loud.
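
As a rough illustration of the backend-type idea, the sketch below maps a 
single config value to the capabilities it implies. It is only a sketch under 
assumptions: BackendType and its fields are hypothetical names, the key is the 
placeholder "blah.blah.backend.type" from above, a plain Map stands in for the 
real configuration object, and the "filesystem" value is an assumed name for 
the test-only FS backend.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: the type, its fields, and the key are illustrative.
enum BackendType {
    FILESYSTEM(false, false), // test-only backend, no Phoenix, no aggregation
    HBASE(true, true);        // direct HBase access plus Phoenix and aggregation

    // Placeholder key taken from the comment above, not a real property name.
    static final String KEY = "blah.blah.backend.type";

    final boolean phoenixAccess;
    final boolean aggregation;

    BackendType(boolean phoenixAccess, boolean aggregation) {
        this.phoenixAccess = phoenixAccess;
        this.aggregation = aggregation;
    }

    // Resolves the backend from a config map; fails fast if the key is absent.
    static BackendType fromConf(Map<String, String> conf) {
        String raw = conf.get(KEY);
        if (raw == null) {
            throw new IllegalArgumentException("Missing config: " + KEY);
        }
        // valueOf throws IllegalArgumentException for unknown backend names.
        return valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }
}
```

The point of the design is that reader, writer, and aggregation support are 
all derived from one setting instead of being wired up through separate 
class-name properties.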

> Refactor timelineservice.storage to add support to online and offline 
> aggregation writers
> -----------------------------------------------------------------------------------------
>
>                 Key: YARN-3904
>                 URL: https://issues.apache.org/jira/browse/YARN-3904
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Li Lu
>            Assignee: Li Lu
>         Attachments: YARN-3904-YARN-2928.001.patch, 
> YARN-3904-YARN-2928.002.patch, YARN-3904-YARN-2928.003.patch, 
> YARN-3904-YARN-2928.004.patch, YARN-3904-YARN-2928.005.patch, 
> YARN-3904-YARN-2928.006.patch
>
>
> After we finished the design for time-based aggregation, we can adopt our 
> existing Phoenix storage into the storage of the aggregated data. In this 
> JIRA, I'm proposing to refactor writers to add support for aggregation 
> writers. Offline aggregation writers typically have less contextual 
> information. We can distinguish these writers by special naming. We can also 
> use CollectorContexts to model all contextual information and use it in our 
> writer interfaces. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
