The problem I see with "sequential access jumping all over the place" is that it increases the utilization of the disks. Over the years disks have become ever faster at sequential access, whereas random access (which has to move the actuator) has not improved at the same pace - this is what ZFS exploits when writing. With its fancy detection of sequential access patterns and improved readahead, ZFS should be able to deal with the latency aspect of randomized read accesses - but at the expense of higher disk utilization.

If many processes access the same disks, they may "run out of IOPS" earlier than in an environment with sequential accesses through contiguous data. Obviously this depends heavily on the workload - but with the trend towards ever higher capacity disks, IOPS become a valuable resource, and it may be worth thinking about how to use them most efficiently. A "self-optimizing" mechanism that rearranges files to become contiguous, in the background or on request, may therefore be useful.
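
To illustrate, the crudest user-level version of such a rearrangement is simply "stream the file into a fresh copy and rename it over the original", so the allocator gets to lay the new blocks out in one pass. This is only a sketch - the temporary name, buffer size and permissions are made up, and a real mechanism would live inside the filesystem, preserve holes and attributes, and cope with concurrent writers:

/*
 * Minimal sketch of "defragment by rewrite": stream the file into a new
 * copy so the allocator can lay it out contiguously, then rename the copy
 * over the original.  Illustration only - it drops the original file mode
 * and holes.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BUFSZ   (1 << 20)               /* 1 MB copy buffer */

static int
rewrite_file(const char *path)
{
        char tmp[4096];
        char *buf = malloc(BUFSZ);
        ssize_t n;
        int in, out;

        if (buf == NULL)
                return (-1);
        (void) snprintf(tmp, sizeof (tmp), "%s.rewrite", path);

        in = open(path, O_RDONLY);
        out = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0600);
        if (in < 0 || out < 0) {
                perror("open");
                if (in >= 0)
                        (void) close(in);
                if (out >= 0)
                        (void) close(out);
                free(buf);
                return (-1);
        }

        /* Sequential copy: the new blocks get allocated in one pass. */
        while ((n = read(in, buf, BUFSZ)) > 0) {
                if (write(out, buf, n) != n) {
                        perror("write");
                        break;
                }
        }

        (void) fsync(out);
        (void) close(in);
        (void) close(out);
        free(buf);

        /* Replace the original with the freshly laid out copy. */
        if (n == 0 && rename(tmp, path) == 0)
                return (0);
        (void) unlink(tmp);
        return (-1);
}

int
main(int argc, char **argv)
{
        if (argc != 2) {
                (void) fprintf(stderr, "usage: %s file\n", argv[0]);
                return (1);
        }
        return (rewrite_file(argv[1]) == 0 ? 0 : 1);
}

Of course this burns bandwidth and breaks block sharing with snapshots, which is exactly why it would be nicer to have something like it inside ZFS.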

- Franz

Gregory Shaw wrote:

Rich, correct me if I'm wrong, but here's the scenario I was thinking of:

- A large file is created.
- Over time, the file grows and shrinks.

The anticipated on-disk layout is that new extents are allocated as the file changes, and those extents may or may not be on multiple spindles.

I envision fragmentation over time that will cause sequential access to jump all over the place. With smart controllers or disks that do read caching, their use of stripes and read-ahead (if enabled) could hurt performance.

So, my thought was to de-fragment the file to make it more contiguous and to allow hardware read-ahead to be effective.

An additional benefit would be to spread it across multiple spindles in a contiguous fashion, such as:

disk0: 32mb
disk1: 32mb
disk2: 32mb
... etc.

Perhaps this is unnecessary. I'm simply trying to grasp the long term performance implications of COW data.
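
One way to put a number on the "jump all over the place" cost would be to read the same file once in order and once at shuffled offsets and compare throughput. A rough sketch (the 128K chunk size is arbitrary, and you would want a file much larger than RAM so the cache doesn't hide the seeks):

/*
 * Rough sketch: compare sequential vs. random read throughput over the
 * same file.  Chunk size and the shuffle are arbitrary choices.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>

#define CHUNK   (128 * 1024)

static double
now(void)
{
        struct timeval tv;

        (void) gettimeofday(&tv, NULL);
        return (tv.tv_sec + tv.tv_usec / 1e6);
}

static double
read_chunks(int fd, off_t *offsets, long nchunks, char *buf)
{
        double t0 = now();
        long i;

        for (i = 0; i < nchunks; i++)
                (void) pread(fd, buf, CHUNK, offsets[i]);
        return (now() - t0);
}

int
main(int argc, char **argv)
{
        struct stat st;
        off_t *offsets;
        char *buf;
        long nchunks, i;
        int fd;
        double seq, rnd;

        if (argc != 2 || (fd = open(argv[1], O_RDONLY)) < 0) {
                (void) fprintf(stderr, "usage: %s file\n", argv[0]);
                return (1);
        }
        (void) fstat(fd, &st);
        nchunks = st.st_size / CHUNK;
        buf = malloc(CHUNK);
        offsets = malloc(nchunks * sizeof (off_t));
        if (buf == NULL || offsets == NULL)
                return (1);

        for (i = 0; i < nchunks; i++)           /* in-order offsets */
                offsets[i] = (off_t)i * CHUNK;
        seq = read_chunks(fd, offsets, nchunks, buf);

        for (i = nchunks - 1; i > 0; i--) {     /* Fisher-Yates shuffle */
                long j = rand() % (i + 1);
                off_t t = offsets[i];
                offsets[i] = offsets[j];
                offsets[j] = t;
        }
        rnd = read_chunks(fd, offsets, nchunks, buf);

        (void) printf("sequential: %.1f MB/s  random: %.1f MB/s\n",
            (double)nchunks * CHUNK / seq / 1e6,
            (double)nchunks * CHUNK / rnd / 1e6);
        return (0);
}

On a badly fragmented file the two numbers should converge; on a contiguous one, the sequential pass should win by a wide margin.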

On May 15, 2006, at 8:47 AM, Roch Bourbonnais - Performance Engineering wrote:


Gregory Shaw writes:

I really like the idea below:
    - the ability to defragment a file 'live'.

I can see instances where that could be very useful. For instance, if you have multiple LUNs (or spindles, whatever) using ZFS, you could re-optimize large files to spread the chunks across as many spindles as possible. Or, as stated below, make it contiguous. I don't know if that is automatic with ZFS today, but it's an idea.


I think the expected benefit of making it contiguous is rooted in the belief that bigger I/Os are the only way to reach top performance.

I think that before ZFS, both physical and logical contiguity were required to enable sufficient readahead and get performance.

Once we have good readahead based on detected logically contiguous accesses, it may well be possible to get top device speed through reasonably-sized I/O concurrency.
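
As an illustration of what "reasonably-sized I/O concurrency" could look like from user land, something along these lines keeps several moderate reads in flight at once (the thread count and 128K request size are arbitrary knobs, not recommendations; compile with -lpthread):

/*
 * Sketch of concurrent reads: instead of one huge sequential stream,
 * several threads each stride through the file issuing moderate-sized
 * preads, keeping multiple I/Os in flight at the device.
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

#define NTHREADS        8
#define CHUNK           (128 * 1024)

static int fd;
static off_t filesize;

static void *
reader(void *arg)
{
        long id = (long)arg;
        char *buf = malloc(CHUNK);
        off_t off;

        if (buf == NULL)
                return (NULL);
        /* Each thread reads every NTHREADS-th chunk of the file. */
        for (off = (off_t)id * CHUNK; off < filesize;
            off += (off_t)NTHREADS * CHUNK)
                (void) pread(fd, buf, CHUNK, off);
        free(buf);
        return (NULL);
}

int
main(int argc, char **argv)
{
        pthread_t tid[NTHREADS];
        struct stat st;
        long i;

        if (argc != 2 || (fd = open(argv[1], O_RDONLY)) < 0) {
                (void) fprintf(stderr, "usage: %s file\n", argv[0]);
                return (1);
        }
        (void) fstat(fd, &st);
        filesize = st.st_size;

        for (i = 0; i < NTHREADS; i++)
                (void) pthread_create(&tid[i], NULL, reader, (void *)i);
        for (i = 0; i < NTHREADS; i++)
                (void) pthread_join(tid[i], NULL);
        return (0);
}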

-r


-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273        Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive ULVL4-382           [EMAIL PROTECTED] (work)
Louisville, CO 80028-4382                 [EMAIL PROTECTED] (home)
"When Microsoft writes an application for Linux, I've Won." - Linus Torvalds



_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
