[ 
https://issues.apache.org/jira/browse/YARN-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914665#comment-13914665
 ] 

Billie Rinaldi commented on YARN-1730:
--------------------------------------

bq. Why do we need separate start-time caches for read and write calls?
The write cache is essential for good write throughput: without it, every 
write would have to hit disk to look up the entity's start time.  The write 
cache should be sized to the maximum number of active entities (entities that 
are still receiving writes).  That number may vary, so [~zjshen] suggested 
making the size configurable.

The read cache exists only to improve read performance, so that a query doesn't 
have to hit disk twice per entity.  It holds the most recently queried entities 
rather than the most recently written ones.  On a write I add the entity to 
both the read cache and the write cache, because recently written entities are 
also generally more likely to be queried.  There is no particular upper bound 
on the useful size of this cache; it can be as large as you want.
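
As a rough illustration of the two caches described above, here is a minimal 
sketch.  The class and method names (StartTimeLruCache, StartTimeCaches, 
recordWrite, recordRead) are hypothetical and are not taken from the patch; 
the LRU eviction policy and the String entity key are also assumptions.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded LRU map from an entity key to its start time. */
class StartTimeLruCache {
    private final Map<String, Long> map;

    StartTimeLruCache(final int maxEntries) {
        // accessOrder=true keeps the least recently used entry eldest,
        // so removeEldestEntry evicts it once the cache is over capacity.
        this.map = Collections.synchronizedMap(
            new LinkedHashMap<String, Long>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                    return size() > maxEntries;
                }
            });
    }

    /** Returns the cached start time, or null on a miss. */
    Long get(String entityKey) { return map.get(entityKey); }

    void put(String entityKey, long startTime) { map.put(entityKey, startTime); }

    int size() { return map.size(); }
}

/** Hypothetical holder for the separate write and read caches. */
class StartTimeCaches {
    final StartTimeLruCache writeCache;  // sized to the number of active entities
    final StartTimeLruCache readCache;   // can be as large as memory allows

    StartTimeCaches(int writeCacheSize, int readCacheSize) {
        this.writeCache = new StartTimeLruCache(writeCacheSize);
        this.readCache = new StartTimeLruCache(readCacheSize);
    }

    /** A write populates both caches: recently written entities tend to be queried. */
    void recordWrite(String entityKey, long startTime) {
        writeCache.put(entityKey, startTime);
        readCache.put(entityKey, startTime);
    }

    /** A query populates only the read cache, leaving the write cache untouched. */
    void recordRead(String entityKey, long startTime) {
        readCache.put(entityKey, startTime);
    }
}
```

Keeping the caches separate means a burst of queries cannot evict the start 
times of entities that are still receiving writes.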

bq. Shouldn't we do the write-locking at the put(..) API level itself and not 
just when creating the start time? Or at least when the actual write happens 
for a given entity?
It's a good question; we could decide to do this.  Locking when determining the 
start time is essential so that two writes for the same entity can't come up 
with different start times.  The writes to leveldb in a put are atomic, so that 
part isn't an issue.  The question is whether we care about the following 
ordering: writes 1 and 2 come in, write 1 sets the start time for an entity, 
write 2 uses that start time, but write 2's put completes before write 1's.
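
To make the locking point concrete, here is a minimal sketch of guarding only 
the start-time determination.  StartTimeResolver and getOrCreateStartTime are 
hypothetical names, the in-memory map stands in for the leveldb lookup, and the 
single global lock is an assumption (per-entity locks would also work); none of 
this is taken from the patch itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper that fixes an entity's start time under a lock. */
class StartTimeResolver {
    // Stand-in for the on-disk store; a real implementation would query leveldb.
    private final Map<String, Long> store = new HashMap<>();
    private final ReentrantLock lock = new ReentrantLock();

    /**
     * Returns the start time already recorded for the entity, or records and
     * returns the candidate if none exists.  The lock makes the check and the
     * record a single step, so two concurrent writers for the same entity
     * cannot end up with different start times.
     */
    long getOrCreateStartTime(String entityKey, long candidateStartTime) {
        lock.lock();
        try {
            Long existing = store.get(entityKey);
            if (existing != null) {
                return existing;
            }
            store.put(entityKey, candidateStartTime);
            return candidateStartTime;
        } finally {
            lock.unlock();
        }
    }
}
```

Note that the lock is released before the caller performs its put, so this 
sketch still permits the ordering described above, where the second writer's 
put completes first.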

> Leveldb timeline store needs simple write locking
> -------------------------------------------------
>
>                 Key: YARN-1730
>                 URL: https://issues.apache.org/jira/browse/YARN-1730
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Billie Rinaldi
>            Assignee: Billie Rinaldi
>         Attachments: YARN-1730.1.patch, YARN-1730.2.patch, YARN-1730.3.patch, 
> YARN-1730.4.patch, YARN-1730.5.patch
>
>
> The actual data writes are performed atomically in a batch, but a lock should 
> be held while identifying a start time for the entity, which precedes every 
> write.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
