Do not underestimate the performance tuning of database engines, and do not
overestimate the performance tuning of filesystems. In an image retrieval
system I worked on, database BLOB storage was faster. A filesystem has its own
overhead in locating a file by name and retrieving its blocks, whereas a
database already has its data file open and only needs to seek to an offset
that is already indexed. When you SELECT from a database using proper keys and
indexes, it is very, very fast. When you open a file, the OS has to slog
through several directories' worth of metadata to reach the file's metadata
before it can even read the file. What's more, where database indexes are
designed to be kept in memory, your file tables are merely MRU-cached, and they
share lookup/cache space across the entire VFS. The database also buys you
atomicity, which is important during long lookups. In the end, disk reads
should be substantially higher on the filesystem side.
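To make the comparison concrete, here is a minimal sketch of the database side using SQLite via Python's standard sqlite3 module (the `images` table and GUID key are hypothetical, not from any project mentioned here). The engine already has its data file open, and the primary-key index takes it straight to the row:

```python
import sqlite3

# Hypothetical schema: one BLOB per image, keyed by a GUID string.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id TEXT PRIMARY KEY, data BLOB)")
conn.execute("INSERT INTO images VALUES (?, ?)",
             ("4e4e-8f9e7dce-xyz", b"...image bytes..."))

# INDEX-SEEK-READ: the primary-key index locates the row, the engine
# seeks to its offset and reads the BLOB -- no directory walk involved.
row = conn.execute("SELECT data FROM images WHERE id = ?",
                   ("4e4e-8f9e7dce-xyz",)).fetchone()
print(row[0])  # b'...image bytes...'
```

The filesystem equivalent of that one SELECT is a name lookup through every directory component plus a metadata read before the first data block comes back.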
Then consider directory entry limits, which vary by filesystem. Old NT had a
64k-files-per-directory limit (removed in later versions). To balance the
lookups, you need to resort to using GUIDs, with groups of characters as
subdirectory names, to keep the files per directory down (e.g.
4e4e-8f9e7dce-xyz would be stored as "\4e4e\8f9e\7dce\4e4e-8f9e7dce-xyz").
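A minimal sketch of that fan-out scheme (the function name and the width/depth parameters are my own choices, not anything standard); it slices the GUID's leading character groups into subdirectory names:

```python
from pathlib import PurePosixPath

def fanout_path(guid: str, root: str = "store",
                width: int = 4, depth: int = 3) -> str:
    """Map a GUID to a nested path so no single directory grows too large."""
    key = guid.replace("-", "")  # only the raw characters drive the fan-out
    parts = [key[i * width:(i + 1) * width] for i in range(depth)]
    return str(PurePosixPath(root, *parts, guid))

print(fanout_path("4e4e-8f9e7dce-xyz"))
# store/4e4e/8f9e/7dce/4e4e-8f9e7dce-xyz
```

With width 4 and depth 3 each level caps out at a few thousand entries, well under any per-directory limit.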
If you find a filesystem is faster, I suggest you examine your database data
definition. You can't get any faster than INDEX-SEEK-READ, which is what the
database does.
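You can check for that INDEX-SEEK yourself. With SQLite, for instance, EXPLAIN QUERY PLAN on a keyed lookup reports a SEARCH using an index rather than a full-table SCAN (the `records` table here is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id TEXT PRIMARY KEY, data BLOB)")

# The last column of each plan row is a human-readable description;
# a lookup on a proper key should say SEARCH ... USING INDEX, not SCAN.
(plan_row,) = conn.execute(
    "EXPLAIN QUERY PLAN SELECT data FROM records WHERE id = ?", ("x",)
).fetchall()
print(plan_row[-1])
```

If your own data definition shows SCAN here instead, that is the place to start tuning before blaming database I/O.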
________________________________
From: Mohammed Rashad <mohammedrasha...@gmail.com>
To: witty-interest@lists.sourceforge.net
Sent: Sunday, October 7, 2012 9:44 AM
Subject: [Wt-interest] Wt File IO vs Database IO
All,
Due to the large amount of data in a crowd-sourced mapping project, I had
decided to completely eliminate the database and use flat files for storage,
retrieval, and query.
I thought of storing each db record as an individual file, so retrieval is
fast and no search over the entire db or a single file is needed.
But if a table has more than tens of thousands of records, and users are
accessing (the same or different) records from different places, that will
result in N file I/O operations.
Will this be a bottleneck in the application? Consider each file to be of
size <= 15 KB.
The main reason to eliminate the db is a performance bottleneck in
database I/O.
So will moving to the new model help in any way, given that the number of
users and the amount of data will be much more than expected?
--
Regards,
Rashad
_______________________________________________
witty-interest mailing list
witty-interest@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/witty-interest