Hi Satish,
take a look at
http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperAdmin.html#sc_maintenance

This can be run as a cron job and will get rid of old unwanted logs and
snapshots.
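
For example, a small wrapper around ZooKeeper's PurgeTxnLog utility could be
scheduled from cron. The directories, the number of snapshots kept, and the
exact purge() signature below are assumptions to check against your release,
not something spelled out on that docs page.

    import java.io.File;

    import org.apache.zookeeper.server.PurgeTxnLog;

    /**
     * Rough sketch of a cleanup job to run from cron: keep the newest few
     * snapshots and delete the older snapshots plus the transaction logs
     * they make obsolete. Paths and the purge() signature are assumptions;
     * check the PurgeTxnLog class shipped with your ZooKeeper release.
     */
    public class PurgeOldLogs {
        public static void main(String[] args) throws Exception {
            File dataLogDir = new File("/var/zookeeper/datalog"); // assumed dataLogDir
            File snapDir = new File("/var/zookeeper/data");       // assumed dataDir
            int snapshotsToKeep = 3;                              // keep at least the latest few

            PurgeTxnLog.purge(dataLogDir, snapDir, snapshotsToKeep);
        }
    }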

mahadev


On 4/24/09 10:18 AM, "Satish Bhatti" <cthd2...@gmail.com> wrote:

> A follow-up to this: I implemented method (b) and ran a test that
> generated 100K ids.  This generated 1.3 GB worth of transaction logs.
> Question: when can these be safely deleted?  How does one know which ones
> may be deleted?  Or do they need to exist forever?
> 
> On Fri, Apr 24, 2009 at 9:52 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> 
>> Of the methods proposed,
>> 
>> a) recursive sequential files
>> 
>> b) latest state file(s) updated using a pseudo-transaction to give out a
>> range of numbers to allocate
>> 
>> c) just probe zxid
>> 
>> You should be pretty good with any of them.  With (a), you have to be
>> careful to avoid race conditions when you get to the end of the range for
>> the sub-level.  With (b), you get guaranteed results, although the
>> highest-throughput versions might have gaps (which shouldn't bother you);
>> the code for this is more complex than the other implementations.  With
>> (c), you could have potentially large gaps in the sequence, but with 64
>> bits that shouldn't be a big deal.  The code for that version would be
>> the simplest of any of them.
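
A rough sketch of what (b) could look like with the plain ZooKeeper Java API
follows. The znode path, batch size, and class name are made up for
illustration; the counter znode is assumed to already exist and to hold a
number as text, and the versioned setData() is what plays the role of the
pseudo-transaction.

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;

    /**
     * Sketch of method (b): reserve a block of ids with an optimistic
     * compare-and-set on a counter znode, then hand them out locally.
     * Assumes the counter znode already exists and initially holds "0".
     */
    public class RangeIdAllocator {
        private final ZooKeeper zk;
        private final String counterPath; // e.g. "/ids/next" (made-up path)
        private final long batchSize;
        private long next = 0;            // next id to hand out locally
        private long limit = 0;           // first id outside the reserved range

        public RangeIdAllocator(ZooKeeper zk, String counterPath, long batchSize) {
            this.zk = zk;
            this.counterPath = counterPath;
            this.batchSize = batchSize;
        }

        /** Returns the next id, reserving a fresh range when the local one runs out. */
        public synchronized long nextId() throws KeeperException, InterruptedException {
            if (next >= limit) {
                reserveRange();
            }
            return next++;
        }

        private void reserveRange() throws KeeperException, InterruptedException {
            while (true) {
                Stat stat = new Stat();
                byte[] data = zk.getData(counterPath, false, stat);
                long base = Long.parseLong(new String(data));
                byte[] bumped = Long.toString(base + batchSize).getBytes();
                try {
                    // The versioned setData is the "pseudo transaction": it only
                    // succeeds if nobody changed the counter since we read it.
                    zk.setData(counterPath, bumped, stat.getVersion());
                    next = base;
                    limit = base + batchSize;
                    return;
                } catch (KeeperException.BadVersionException e) {
                    // Lost the race to another allocator; re-read and retry.
                }
            }
        }
    }

If a client crashes while still holding part of its reserved range, those ids
are simply skipped, which is where the gaps mentioned above come from.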
>> 
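And a sketch of (c): do a trivial write and use the zxid ZooKeeper assigns to
that write as the id. The probe path is made up and assumed to have been
created once up front.

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;

    /**
     * Sketch of method (c): every write is assigned a cluster-wide unique,
     * monotonically increasing 64-bit zxid, so the zxid of a dummy write can
     * serve directly as an id. Gaps are expected, since every other write to
     * the ensemble also consumes zxids.
     */
    public class ZxidIdGenerator {
        private final ZooKeeper zk;
        private final String probePath; // e.g. "/ids/probe" (made-up, created once up front)

        public ZxidIdGenerator(ZooKeeper zk, String probePath) {
            this.zk = zk;
            this.probePath = probePath;
        }

        public long nextId() throws KeeperException, InterruptedException {
            // -1 means "any version"; the returned Stat carries the zxid of this write.
            Stat stat = zk.setData(probePath, new byte[0], -1);
            return stat.getMzxid();
        }
    }
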
>> On Fri, Apr 24, 2009 at 8:56 AM, Satish Bhatti <cthd2...@gmail.com> wrote:
>> 
>>> Hello Ben,
>>> Basically the ids are document ids.  We will eventually have several
>>> billion documents in our system, and each has a unique long id.
>>> Currently we are using a database sequence to generate these longs.
>>> Having eliminated other uses of the database, we didn't want to keep it
>>> around just to generate ids.  That is why I am looking to use ZooKeeper
>>> to generate them instead.
>>> 
>>> 
>> 
