Erwin Tam commented on ZOOKEEPER-464:
The ledger delete JUnit tests failed intermittently before the last uploaded
patch, which resolves the problem. I'll document the bug and the fix here.
The failures were related to an issue I saw earlier when running unit tests
with a very small entry log size limit (2K). When an entry log rolls over, we
create a new one by first writing the 1024-byte "BKLO" header to the beginning
of the file. The problem is that the byte buffer holding this header was
statically defined. In our JUnit tests, we have multiple Bookie servers (and
thus EntryLogger instances) in the same JVM. If more than one EntryLogger rolls
over its current log and starts the next one at the same time, they all use the
same entry log header buffer, and access to that static buffer is not
synchronized.
The header byte buffer is cleared before it is written to the log file. Since
it is static, one thread can clear it and then another thread (from a second
Bookie server) can clear it at about the same time. The first thread then
writes the header, which leaves the buffer's position at its limit, and nothing
resets it. The second thread then writes from a buffer that has no remaining
bytes, so the entry logs in the second Bookie are created without the header.
When we later read through those files to figure out which ledgers they
contain, we read incorrect values and try to allocate byte buffers based on a
bogus length field (basically junk bytes). That is what causes the Java heap
space error.
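
To make the sequence concrete, here is a minimal single-threaded replay of that
interleaving. It is only a sketch and not the actual EntryLogger code: the
class name SharedHeaderDemo, the HEADER field, and the in-memory "log files"
are assumptions made up for illustration. It shows that once one writer has
drained the shared buffer, a second writer that cleared it earlier writes zero
bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative, not from the BookKeeper source.
class SharedHeaderDemo {
    // Stand-in for the statically defined 1024-byte "BKLO" header buffer.
    static final ByteBuffer HEADER = ByteBuffer.allocate(1024);
    static {
        HEADER.put("BKLO".getBytes(StandardCharsets.US_ASCII));
    }

    // Replays the problematic ordering on one thread: both loggers clear the
    // shared buffer, then both write it. Returns the two resulting log sizes.
    static int[] replayInterleaving() {
        ByteArrayOutputStream logA = new ByteArrayOutputStream();
        ByteArrayOutputStream logB = new ByteArrayOutputStream();
        WritableByteChannel chA = Channels.newChannel(logA);
        WritableByteChannel chB = Channels.newChannel(logB);
        try {
            HEADER.clear();    // logger A: position = 0, limit = 1024
            HEADER.clear();    // logger B: same buffer, same state
            chA.write(HEADER); // A writes 1024 bytes; position is now 1024
            chB.write(HEADER); // B finds nothing remaining and writes 0 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] { logA.size(), logB.size() };
    }

    public static void main(String[] args) {
        int[] sizes = replayInterleaving();
        System.out.println("log A header bytes: " + sizes[0]); // 1024
        System.out.println("log B header bytes: " + sizes[1]); // 0
    }
}
```

With real threads the ordering is nondeterministic, which matches the
intermittent nature of the test failures.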
The fix is simple: make the log file header a non-static field and initialize
it in the EntryLogger constructor. In practice, we shouldn't be running
multiple Bookies within the same JVM, so we wouldn't run into this problem.
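
A rough sketch of the shape of that fix follows. This is not the patch itself:
EntryLoggerSketch, resetHeader, and writeHeader are hypothetical names, and the
log files are replaced by in-memory streams so the example is self-contained.
The point is only that each instance owns its header buffer, so the same
interleaving as before no longer loses a header.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names are illustrative, not from the actual patch.
class EntryLoggerSketch {
    static final int HEADER_SIZE = 1024;

    // Non-static: every logger instance gets its own header buffer.
    private final ByteBuffer header;

    EntryLoggerSketch() {
        header = ByteBuffer.allocate(HEADER_SIZE);
        header.put("BKLO".getBytes(StandardCharsets.US_ASCII));
    }

    // Rewind the buffer so the full header is written next.
    void resetHeader() {
        header.clear();
    }

    // Write the whole header to the channel and return the byte count.
    int writeHeader(WritableByteChannel ch) throws IOException {
        int written = 0;
        while (header.hasRemaining()) {
            written += ch.write(header);
        }
        return written;
    }

    // Same ordering as the failing case: clear, clear, write, write.
    static int[] replayInterleaving() {
        EntryLoggerSketch a = new EntryLoggerSketch();
        EntryLoggerSketch b = new EntryLoggerSketch();
        ByteArrayOutputStream logA = new ByteArrayOutputStream();
        ByteArrayOutputStream logB = new ByteArrayOutputStream();
        try {
            a.resetHeader();
            b.resetHeader();
            a.writeHeader(Channels.newChannel(logA));
            b.writeHeader(Channels.newChannel(logB));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] { logA.size(), logB.size() };
    }

    public static void main(String[] args) {
        int[] sizes = replayInterleaving();
        System.out.println("log A header bytes: " + sizes[0]); // 1024
        System.out.println("log B header bytes: " + sizes[1]); // 1024
    }
}
```

Synchronizing access to the static buffer would also have worked, but a
per-instance buffer avoids sharing state between Bookies at all.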
> Need procedure to garbage collect ledgers
> Key: ZOOKEEPER-464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
> Project: Zookeeper
> Issue Type: New Feature
> Components: contrib-bookkeeper
> Reporter: Flavio Paiva Junqueira
> Assignee: Erwin Tam
> Fix For: 3.4.0
> Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch,
> ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
> An application using BookKeeper is likely to use a large number of ledgers
> over time. Such an application might not need all ledgers created over time
> and might want to delete some of these ledgers to free up some space on
> bookies. The idea of this jira is to implement a procedure that enables an
> application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on
> ZooKeeper, and delete the ledger data on corresponding bookies.