[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496538#comment-14496538
 ] 

Junping Du commented on YARN-3411:
----------------------------------

Thanks [~vrushalic] for delivering the proposal and POC patch; excellent work!
Some quick comments from walking through the proposal:
bq. Entity Table - primary key components - putting the UserID first helps to 
distribute writes across the regions in the hbase cluster. Pros: avoids 
single region hotspotting. Cons: connections would be open to several region 
servers during writes from per node ATS.
It looks like we are trying to get rid of region server hotspotting issues, and I 
agree this design could help. However, it is still possible that a specific user 
submits far more applications than anyone else, in which case the region hotspot 
issue will still appear, won't it? I think a more general way to solve this 
problem is to salt the keys with a prefix, as sketched below. Thoughts?
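
A minimal sketch of what key salting could look like (a hypothetical helper, not 
part of the POC patch; the bucket count and key layout are assumptions):
{code:java}
import java.util.Arrays;

/** Sketch: prepend a salt byte derived from the unsalted key so writes spread
 *  across a fixed number of buckets even when a single user dominates. */
public final class RowKeySalter {
  private static final int SALT_BUCKETS = 16; // assumed; should match region pre-splits

  public static byte[] salt(byte[] unsaltedKey) {
    int bucket = (Arrays.hashCode(unsaltedKey) & 0x7fffffff) % SALT_BUCKETS;
    byte[] salted = new byte[unsaltedKey.length + 1];
    salted[0] = (byte) bucket;
    System.arraycopy(unsaltedKey, 0, salted, 1, unsaltedKey.length);
    return salted;
  }
}
{code}
Reads would then need to fan out one scan per bucket, so this trades some 
read-side complexity for evenly distributed writes.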

bq. Entity Table - column families - config needs to be stored as key value, not 
as a blob to enable efficient key based querying based on config param name. 
storing it in a separate column family helps to avoid scanning over config 
while reading metrics and vice versa
+1. This leverages the strengths of a columnar database. We should also avoid 
storing default values for keys. However, that sounds challenging if the 
TimelineClient only has a Configuration object to work with.
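
To illustrate, a rough sketch of writing config as one cell per param in a 
dedicated column family, skipping values that match the loaded defaults 
(assuming the HBase 1.x client API; the family name "c" and the table handle 
are assumptions):
{code:java}
import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

/** Sketch: one cell per config param in its own family, defaults skipped. */
public class ConfigColumnWriter {
  private static final byte[] CONFIG_FAMILY = Bytes.toBytes("c"); // assumed family name

  public static void writeConfig(Table entityTable, byte[] rowKey,
      Configuration appConf) throws IOException {
    // Baseline to compare against; deciding what counts as a "default" when the
    // client only has a Configuration object is exactly the tricky part noted above.
    Configuration defaults = new Configuration();
    Put put = new Put(rowKey);
    for (Map.Entry<String, String> e : appConf) {
      // Store only values that differ from the baseline so each param stays
      // individually queryable without bloating the row.
      if (!e.getValue().equals(defaults.get(e.getKey()))) {
        put.addColumn(CONFIG_FAMILY, Bytes.toBytes(e.getKey()),
            Bytes.toBytes(e.getValue()));
      }
    }
    if (!put.isEmpty()) {
      entityTable.put(put);
    }
  }
}
{code}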

bq. Entity Table - metrics are written to with an hbase cell timestamp set to 
top of the minute or top of the 5 minute interval or whatever is decided. This 
helps in timeseries storage and retrieval in case of querying at the entity 
level.
Can we also let the TimelineCollector aggregate metrics over a similar time 
interval rather than sending every metric to HBase/Phoenix as it is received? 
This may help relieve some pressure on the backend.
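
For example, rounding the cell timestamp down to the interval boundary could look 
like the sketch below (the interval length and the metric family name "m" are 
assumptions):
{code:java}
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

/** Sketch: write a metric with its cell timestamp rounded to the interval start. */
public final class MetricTimestamps {
  private static final byte[] METRIC_FAMILY = Bytes.toBytes("m"); // assumed family name

  /** e.g. 60 * 1000 for top of the minute, 5 * 60 * 1000 for 5-minute buckets. */
  public static long roundDown(long tsMillis, long intervalMillis) {
    return tsMillis - (tsMillis % intervalMillis);
  }

  public static Put metricPut(byte[] rowKey, String metricName, long value,
      long tsMillis, long intervalMillis) {
    Put put = new Put(rowKey);
    put.addColumn(METRIC_FAMILY, Bytes.toBytes(metricName),
        roundDown(tsMillis, intervalMillis), Bytes.toBytes(value));
    return put;
  }
}
{code}
If the collector also pre-aggregated within the same interval, only one cell per 
metric per interval would ever reach HBase.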

bq. Flow by application id table
I still think we should figure out some way to store application attempt info. 
The typical use case here is: for some reason (e.g. a bug or a hardware 
capability issue), some flow/application's AM may consistently fail more often 
than other flows/applications. Keeping this info would help us track such 
issues, wouldn't it?

bq. flow summary daily table (aggregation table managed by Phoenix) - could be 
triggered via co-processor with each put in flow table or a cron run once per 
day to aggregate for yesterday (with catchup functionality in case of backlog 
etc)
Triggering on each put in the flow table sounds a little expensive, especially 
when put activity is very frequent. Maybe we should use some batch mode here? In 
addition, I think we can leverage the per-node TimelineCollector to do some 
first-level aggregation, which can help relieve the workload on the backend; see 
the sketch below.
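
A rough sketch of the batching idea (the buffer size and table handle are 
assumptions; a time-based flush would also be needed in practice):
{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;

/** Sketch: buffer puts in the per-node collector and flush them in batches. */
public class BatchingFlowWriter {
  private final Table flowTable;
  private final int batchSize;
  private final List<Put> buffer = new ArrayList<Put>();

  public BatchingFlowWriter(Table flowTable, int batchSize) {
    this.flowTable = flowTable;
    this.batchSize = batchSize;
  }

  /** Queue a put; only hit HBase once the batch fills (periodic flush not shown). */
  public synchronized void write(Put put) throws IOException {
    buffer.add(put);
    if (buffer.size() >= batchSize) {
      flush();
    }
  }

  public synchronized void flush() throws IOException {
    if (!buffer.isEmpty()) {
      flowTable.put(buffer); // one batched round trip instead of a put per metric
      buffer.clear();
    }
  }
}
{code}
The same buffer could also sum metric values per row before flushing, which would 
be the first-level aggregation mentioned above.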

> [Storage implementation] explore the native HBase write schema for storage
> --------------------------------------------------------------------------
>
>                 Key: YARN-3411
>                 URL: https://issues.apache.org/jira/browse/YARN-3411
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Vrushali C
>            Priority: Critical
>         Attachments: ATSv2BackendHBaseSchemaproposal.pdf, YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
