Vrushali C commented on YARN-3411:

Hi [~djp]

Thanks for the initial quick feedback! Some responses below:

bq. Do we need to make sure data in each row get updated consistently?
I was thinking it is not necessary, since the entity information would come in a 
more streaming fashion, one update at a time anyway. If, say, one column is 
written and another is not, the caller can retry, and the HBase put will simply 
overwrite the existing value.
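
To illustrate the retry argument, here is a minimal sketch. It assumes that 
repeating a put of the same column just replaces the value a reader sees. 
RowSketch and RetrySketch are hypothetical in-memory stand-ins, not the HBase 
client API and not code from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for one row: column name -> latest value.
// This is NOT the HBase client API; it only illustrates the retry argument.
class RowSketch {
    private final Map<String, String> columns = new HashMap<>();

    // Like a put of a single column: writing again replaces the visible value.
    void put(String column, String value) {
        columns.put(column, value);
    }

    String get(String column) {
        return columns.get(column);
    }

    int size() {
        return columns.size();
    }
}

class RetrySketch {
    // Writes every column of an update. Calling it again with the same input
    // leaves the row in the same state, so a retry after a partial write is safe.
    static void writeAll(RowSketch row, Map<String, String> update) {
        for (Map.Entry<String, String> e : update.entrySet()) {
            row.put(e.getKey(), e.getValue());
        }
    }
}
```

If only one column landed before a failure, calling writeAll again with the 
same update fills in the missing column and rewrites the other with the same value.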

bq. We shouldn't swallow exception in updating data to HBase, just log.error() 
may not be enough.
Okay, let me look through and modify that.
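
One possible shape for that change, as a sketch only: log the failure and then 
rethrow it so the caller can decide whether to retry. WriteSketch and its Store 
interface are hypothetical names for illustration and do not come from the patch.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class WriteSketch {
    private static final Logger LOG = Logger.getLogger(WriteSketch.class.getName());

    // Hypothetical stand-in for the storage call that may fail.
    interface Store {
        void put(String rowKey, String value) throws IOException;
    }

    // Logs the failure for diagnostics, then rethrows it instead of swallowing it,
    // so the caller sees the error and can retry or fail the request.
    static void write(Store store, String rowKey, String value) throws IOException {
        try {
            store.put(rowKey, value);
        } catch (IOException e) {
            LOG.log(Level.SEVERE, "Failed to write row " + rowKey, e);
            throw e;
        }
    }
}
```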

bq. We need to check null in writing TimelineEntity to HBase, as TimelineEntity 
could include null events/configurations/metrics, that could make foreach later 
throw NPE exception
I have added some null checks. I will go over the code again and update it to 
ensure there are null checks for entity class members such as configurations 
and metrics.
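
The kind of guard I have in mind looks roughly like the sketch below. 
EntitySketch is a hypothetical simplified stand-in for TimelineEntity, and the 
"c:" and "e:" column prefixes are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical simplified entity; either member may be null.
class EntitySketch {
    Map<String, String> configurations;
    List<String> events;
}

class NullSafeWriter {
    // Returns the column names that would be written, skipping null members
    // so that the for-each loops never dereference a null collection.
    static List<String> columnsFor(EntitySketch entity) {
        List<String> columns = new ArrayList<>();
        if (entity == null) {
            return columns;
        }
        if (entity.configurations != null) {
            for (String key : entity.configurations.keySet()) {
                columns.add("c:" + key);
            }
        }
        if (entity.events != null) {
            for (String event : entity.events) {
                columns.add("e:" + event);
            }
        }
        return columns;
    }
}
```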

> [Storage implementation] explore the native HBase write schema for storage
> --------------------------------------------------------------------------
>                 Key: YARN-3411
>                 URL: https://issues.apache.org/jira/browse/YARN-3411
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Vrushali C
>            Priority: Critical
>         Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
> YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
> YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
> YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
> YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
> YARN-3411.poc.txt
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.

This message was sent by Atlassian JIRA