>>>>> "ah" == Al Hopper <a...@logical-approach.com> writes:

    ah> The main issue is that most flash devices support 128k byte
    ah> pages, and the smallest "chunk" (for want of a better word) of
    ah> flash memory that can be written is a page - or 128kb.  So if
    ah> you have a write to an SSD that only changes 1 byte in one 512
    ah> byte "disk" sector, the SSD controller has to either
    ah> read/re-write the affected page or figure out how to update
    ah> the flash memory with the minimum effect on flash wear.

yeah well, I'm not sure it matters, but that's untrue.

there are two sizes for NAND flash, the minimum write size and the
minimum erase size.  The minimum write size is the size over which
error correction is done, the unit at which inband and OOB data are
interleaved, on NAND flash.  The minimum erase size is just what it
sounds like, the size the cleaner/garbage collector must evacuate.

The minimum write size is I suppose likely to provoke
read/modify/write and wasting of write and wear bandwidth for smaller
writes in flashes which do not have a DRAM+supercap, if you ask to
SYNCHRONIZE CACHE right after the write.  If there is a supercap, or
if you allow the drive to do write caching, then the smaller writes
could be coalesced, making this size irrelevant.  I think the minimum
write size is usually 2 - 4 kB.  I would expect resistance to growing
it larger than 4kB because of NTFS---electrical engineers are usually
over-obsessed with Windows.
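
To put rough numbers on that, here is a minimal sketch, not a model of
any real controller.  It assumes a 4 kB minimum write size, 512-byte
sector writes, and a made-up run of 64 back-to-back writes, and just
counts page programs with and without coalescing.  The function name
and all the constants are mine, purely for illustration.

```python
import math

def pages_programmed(num_writes, write_bytes, page_bytes, coalesce):
    """Count NAND page programs needed for a run of small writes.

    coalesce=False models a cache flush after every write, so each
    write costs at least one whole page program.
    coalesce=True models a write cache that packs the writes into
    full pages before programming them.
    """
    if coalesce:
        return math.ceil(num_writes * write_bytes / page_bytes)
    return num_writes * math.ceil(write_bytes / page_bytes)

PAGE = 4096    # assumed minimum write size (the 2 - 4 kB range above)
SECTOR = 512   # one "disk" sector
N = 64         # hypothetical number of back-to-back sector writes

synced = pages_programmed(N, SECTOR, PAGE, coalesce=False)
cached = pages_programmed(N, SECTOR, PAGE, coalesce=True)
amplification = synced / cached

print(synced, cached, amplification)
```

Under those assumptions the flush-every-write case programs 64 pages
and the coalesced case programs 8, so the waste is the ratio of page
size to write size and nothing to do with the 128kB figure.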

The minimum erase size you don't really care about at all.  That's the
one that's usually at least 128kB.
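
A rough sketch of why the erase size is the cleaner's problem and not
the writer's, assuming a 128kB erase size and a 4 kB write size: a
block is just a group of pages, and the only work tied to the erase
size is copying out whatever pages are still live before the erase.
The names and the example block contents are hypothetical.

```python
def pages_per_erase_block(erase_bytes, page_bytes):
    """How many write-size pages fit in one erase-size block."""
    return erase_bytes // page_bytes

def cleaner_cost(live_flags):
    """Return (pages to copy out, pages reclaimed) for erasing one block.

    live_flags has one boolean per page: True if the page still holds
    current data that the cleaner must move before the erase.
    """
    to_copy = sum(1 for live in live_flags if live)
    return to_copy, len(live_flags) - to_copy

ERASE = 128 * 1024   # assumed minimum erase size
PAGE = 4096          # assumed minimum write size

n_pages = pages_per_erase_block(ERASE, PAGE)
# hypothetical block: 5 pages still live, the rest already superseded
block = [True] * 5 + [False] * (n_pages - 5)
copied, reclaimed = cleaner_cost(block)

print(n_pages, copied, reclaimed)
```

With those numbers there are 32 pages per block, and erasing this
particular block costs 5 page copies to get 27 pages back.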

    ah> For anyone who is interested in getting more details of the
    ah> challenges with flash memory, when used to build solid state
    ah> drives, reading the tech data sheets on the flash memory
    ah> devices will give you a feel for the basic issues that must be
    ah> solved.

and the linux-mtd list will give you a feel for how people are solving
them, because that's the only place I know of where NAND filesystem
work is going on in the open.  There are a bunch of geezers saying ``I
wrote one for BSD but my employer won't let me release it,'' and then
the new crop of Intel/SandForce/STEC proprietary kids, but in the open
world AFAIK there is just yaffs and ubifs.  The T-Mobile G1 uses yaffs.

    ah> Bob's point is well made.  The specifics of a given SSD
    ah> implementation will make the performance characteristics of
    ah> the resulting SSD very difficult to predict or even describe -

I'm really a fan of the idea of using ACARD ANS-9010 for a slog.
It's basically all DRAM+battery, and uses a low performance CF card
for durable storage if the battery starts to run low, or if you
explicitly request it (to move data between ACARD units by moving the
CF card maybe).  It will even make non-ECC RAM into ECC storage (using
a sector size and OOB data :).  It seems like Zeus-like performance at
1/10th the price, but of course it's a little goofy, and I've never
tried it.

slog is where I'd expect the high synchronous workload to be, so this
is where there are small writes that can't be coalesced, I would
presume, and appropriate slog sizes are reachable with DRAM alone.
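
A back-of-envelope sketch of the sizing argument, assuming the slog
only has to hold synchronous writes that have not yet been committed
to the main pool.  The write rate and the hold time below are made-up
inputs, not measurements or ZFS defaults, so plug in your own.

```python
def slog_bytes_needed(sync_write_bytes_per_sec, seconds_to_hold):
    """Bytes of log device needed to absorb a sustained sync workload."""
    return sync_write_bytes_per_sec * seconds_to_hold

RATE = 100 * 1024**2   # hypothetical: 100 MiB/s of synchronous writes
HOLD = 30              # hypothetical: seconds of writes kept in the log

needed = slog_bytes_needed(RATE, HOLD)
needed_gib = needed / 1024**3

print(f"{needed_gib:.2f} GiB")
```

With these inputs it comes to a bit under 3 GiB, which is the kind of
size a handful of DIMMs can cover.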

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
