On Wed, Jun 17 at 13:49, Alan Hargreaves wrote:
> Another question worth asking here is: is a find over the entire filesystem something that they would expect to be executed with sufficient regularity that the execution time would have a business impact?

Exactly.  That's such an odd business workload on 250,000,000 files
that there isn't likely to be much of a shortcut other than just
throwing tons of spindles (or SSDs) at the problem, and/or having tons
of memory.

If the finds are just by name, that's easy for the system to cache, but
if you're expecting to run something against the output of find with
-exec to parse/process 250M files on a regular basis, you'll likely be
severely I/O bound.  Almost to the point of arguing for something like
Hadoop or another form of distributed map/reduce on your dataset with
a lot of nodes, instead of a single storage server.
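To illustrate the difference (paths and patterns below are made up, not
from the original poster's setup), compare a name-only search against
one that has to read every file's contents:

    # Name-only search: walks directory metadata only, which ZFS can
    # keep hot in the ARC after a first pass.
    find /tank/data -name '*.log'

    # Content processing via -exec: every matching file must actually
    # be read from disk, so this stays I/O bound no matter how well
    # the metadata is cached.
    find /tank/data -type f -exec grep -l 'ERROR' {} +

The first run of either will still crawl 250M inodes; the point is that
only the name-only case has any hope of being served from cache on
repeat runs.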


--
Eric D. Mudama
edmud...@mail.bounceswoosh.org

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss