[ 
https://issues.apache.org/jira/browse/YARN-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530911#comment-14530911
 ] 

Zhijie Shen commented on YARN-3448:
-----------------------------------

bq. This is a new dependency introduced for serializing/deserializing 
otherInfos and primaryValues into the db.

I'm curious why we chose it over GenericObjectMapper, which 
we used to serialize/deserialize those fields in the leveldb timeline store.

bq. Can you clarify this comment? I can see this making sense in the 
serviceStart phase.

It's not critical, but I think it's more elegant if the service and its 
background thread don't start running until start is called on it.
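
As a rough sketch of the lifecycle split I mean (the class and method names 
below are hypothetical and not taken from the patch; init/start/stop are 
stand-ins for the serviceInit/serviceStart/serviceStop phases), the background 
thread is only constructed during initialization and begins running in start:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical store used only for illustration; not the class in the patch.
class SketchTimelineStore {
  private final AtomicBoolean running = new AtomicBoolean(false);
  private Thread deletionThread; // built in init(), started only in start()

  // Initialization prepares state but does not launch any thread.
  void init() {
    deletionThread = new Thread(() -> {
      while (running.get()) {
        try {
          Thread.sleep(10); // placeholder for one pass of old-entity cleanup
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }, "sketch-deletion-thread");
    deletionThread.setDaemon(true);
  }

  // The background thread begins running only here. Assumes init() ran first.
  void start() {
    if (running.compareAndSet(false, true)) {
      deletionThread.start();
    }
  }

  // Signals the thread to exit and waits for it to finish.
  void stop() {
    if (running.compareAndSet(true, false)) {
      deletionThread.interrupt();
      try {
        deletionThread.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }

  boolean isBackgroundThreadAlive() {
    return deletionThread != null && deletionThread.isAlive();
  }
}
```

With this shape, constructing and initializing the store has no side effects 
until the caller explicitly starts it.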

bq. Can you make a recommendation here? I can see this being an ever increasing 
list of errors that client are free to log.

I was just thinking out loud. This is a new error code generated by the new 
store implementation. It shouldn't affect existing usage. If users want to 
move to the new implementation, it is probably reasonable to have them 
handle the new error code. One problem is that YARN-3539 is targeting 2.7.1. 
Once that jira gets committed, we need to update the documentation again to 
declare this error code too (perhaps also mentioning this implementation). 



> Add Rolling Time To Lives Level DB Plugin Capabilities
> ------------------------------------------------------
>
>                 Key: YARN-3448
>                 URL: https://issues.apache.org/jira/browse/YARN-3448
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>              Labels: BB2015-05-TBR
>         Attachments: YARN-3448.1.patch, YARN-3448.10.patch, 
> YARN-3448.12.patch, YARN-3448.13.patch, YARN-3448.14.patch, 
> YARN-3448.15.patch, YARN-3448.16.patch, YARN-3448.2.patch, YARN-3448.3.patch, 
> YARN-3448.4.patch, YARN-3448.5.patch, YARN-3448.7.patch, YARN-3448.8.patch, 
> YARN-3448.9.patch
>
>
> For large applications, the majority of the time in LeveldbTimelineStore is 
> spent deleting old entities one record at a time. An exclusive write lock is 
> held during the entire deletion phase, which in practice can take hours. If we are to 
> relax some of the consistency constraints, other performance enhancing 
> techniques can be employed to maximize the throughput and minimize locking 
> time.
> Split the 5 sections of the leveldb database (domain, owner, start time, 
> entity, index) into 5 separate databases. This allows each database to 
> maximize the read cache effectiveness based on the unique usage patterns of 
> each database. With 5 separate databases each lookup is much faster. This can 
> also help with I/O to have the entity and index databases on separate disks.
> Rolling DBs for entity and index DBs. 99.9% of the data are in these two 
> sections, in a 4:1 ratio (index to entity), at least for Tez. We replace DB 
> record removal with file system removal if we create a rolling set of 
> databases that age out and can be efficiently removed. To do this we must 
> add a constraint to always place an entity's events into its correct rolling 
> DB instance based on start time. This allows us to stitch the data back 
> together while reading and performing artificial paging.
> Relax the synchronous writes constraints. If we are willing to accept losing 
> some records that were not flushed by the operating system during a crash, we 
> can use async writes that can be much faster.
> Prefer sequential writes. Sequential writes can be several times faster than 
> random writes. Spend some small effort arranging the writes in such a way 
> that will trend towards sequential write performance over random write 
> performance.
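
The rolling-database idea in the description above can be illustrated with a 
minimal sketch. Everything here is an assumption made for illustration: the 
class name, the fixed rolling period, and the in-memory lists that stand in 
for per-period leveldb instances. In the proposal itself, eviction would be a 
file system removal of an aged-out database rather than a map operation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; a real store would hold one leveldb handle per bucket.
class RollingBucketsSketch {
  private final long periodMs; // width of one rolling bucket
  private final long ttlMs;    // how long data is retained
  // bucket start time -> placeholder for that bucket's database contents
  private final TreeMap<Long, List<String>> buckets = new TreeMap<>();

  RollingBucketsSketch(long periodMs, long ttlMs) {
    this.periodMs = periodMs;
    this.ttlMs = ttlMs;
  }

  // Round a start time down to the beginning of its bucket.
  long bucketFor(long startTimeMs) {
    return Math.floorDiv(startTimeMs, periodMs) * periodMs;
  }

  // All records of an entity go to the bucket chosen by the entity's start time.
  void put(long entityStartTimeMs, String record) {
    buckets.computeIfAbsent(bucketFor(entityStartTimeMs), k -> new ArrayList<>())
        .add(record);
  }

  // Drop every bucket whose whole time range is older than the TTL.
  // Returns the number of buckets removed.
  int evictExpired(long nowMs) {
    long cutoff = nowMs - ttlMs;
    // A bucket covers [start, start + periodMs); it is expired when
    // start + periodMs <= cutoff, i.e. start <= cutoff - periodMs.
    Map<Long, List<String>> expired = buckets.headMap(cutoff - periodMs, true);
    int removed = expired.size();
    expired.clear(); // the view is backed by the map, so this removes the buckets
    return removed;
  }

  int bucketCount() {
    return buckets.size();
  }
}
```

Because whole buckets are dropped at once, no per-record deletion pass is 
needed, which is the point of the constraint on start-time placement.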



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
