Benedikt Meurer wrote:
Jamie McCracken wrote:

While such an API is simple and works for some apps, it's certainly not
going to be usable for applications like file managers or indexers. Calling
a separate out-of-process RPC for each file while loading a directory
would totally kill performance.

If it needed to be called on each file while loading a directory
(which would only be for limited things like the mime type), then we
could also have an API to return a specific piece of metadata for each
file in a directory in one shot.

E.g.

GetMimeTypesForFilesInFolder
 input DBUS_TYPE_STRING s (the folder uri)
 output DBUS_TYPE_DICT  a{ss} (the metadata as filename, mimetype)
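
As a rough sketch of what the service side of such a call might compute, here is the same shape in plain Python with no D-Bus involved. It assumes a local folder path instead of a URI and uses the standard mimetypes module, which guesses from the file name only; both are assumptions made for illustration, not part of the proposal.

```python
import mimetypes
import os


def get_mime_types_for_files_in_folder(folder_path):
    """Return {filename: mimetype} for regular files directly in folder_path.

    Mirrors the a{ss} reply shape: keys and values are both strings.
    """
    result = {}
    with os.scandir(folder_path) as entries:
        for entry in entries:
            if not entry.is_file():
                continue  # skip subdirectories and other non-files
            guessed, _encoding = mimetypes.guess_type(entry.name)
            # a{ss} cannot carry a missing value, so fall back to a generic type.
            result[entry.name] = guessed or "application/octet-stream"
    return result
```

The point is only that one call yields the whole filename-to-mimetype map, so the caller pays for a single round trip per folder rather than one per file.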

Most of the other metadata would be retrieved on demand by users
requesting to see additional metadata for a file so the previous API
should suffice for that.


This would still be a performance problem for fast file managers, and it
would cause unnecessary load on the metadata implementation. Think of a
medium-size folder (around 1000 files). When the file manager enters the
directory it can display up to 50 files at once, and so it doesn't need
to know the metadata for the other 950 files until the user scrolls down
to the last file (slow scrolling in this case, so every file's view
item/row receives an expose event). Nevertheless, the "metadata daemon"
would need to fetch the data for all 1000 files and transfer them, even
though the file manager needs only 5% of them.
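
The 50-of-1000 argument can be sketched as a view that asks for metadata only when a row is exposed. This is a minimal illustration, assuming a hypothetical per-file fetch callable; LazyMetadataView and its method names are made up for the example and do not come from any file manager.

```python
class LazyMetadataView:
    """Fetch metadata only for rows that become visible, and cache it."""

    def __init__(self, filenames, fetch):
        self.filenames = list(filenames)
        self.fetch = fetch      # hypothetical per-file lookup callable
        self.cache = {}
        self.fetch_count = 0    # number of lookups actually performed

    def expose(self, first_row, row_count):
        """Return metadata for the rows currently on screen."""
        visible = self.filenames[first_row:first_row + row_count]
        for name in visible:
            if name not in self.cache:
                self.cache[name] = self.fetch(name)
                self.fetch_count += 1
        return {name: self.cache[name] for name in visible}


files = [f"file{i:04d}.txt" for i in range(1000)]
view = LazyMetadataView(files, fetch=lambda name: "text/plain")
view.expose(0, 50)
print(view.fetch_count)  # 50 lookups for a folder of 1000 files
```

Scrolling by half a screen afterwards costs only the 25 rows not yet cached, which is the behaviour a one-shot call for the whole folder cannot offer.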

Shouldn't be that bad for B-tree based databases, as they burst-read the file's metadata anyhow. Retrieval of any metadata can be asynchronous, so it shouldn't slow anything down at all.
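
A minimal sketch of the asynchronous pattern, assuming a thread pool stands in for the IPC layer. fetch_folder_metadata is a hypothetical placeholder for the slow out-of-process call, and the URI and the returned values are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_folder_metadata(folder_uri):
    # Hypothetical stand-in for the slow out-of-process call.
    return {"notes.txt": "text/plain", "photo.png": "image/png"}


def load_directory(folder_uri, executor):
    """Start the metadata request and return immediately with a Future."""
    return executor.submit(fetch_folder_metadata, folder_uri)


with ThreadPoolExecutor(max_workers=1) as executor:
    pending = load_directory("file:///home/user/docs", executor)
    # The UI would keep listing file names here while the request runs.
    mime_types = pending.result()  # block only once the data is needed
```

A real client would more likely attach a callback than block on the result, but the shape is the same: the directory listing does not wait for the metadata.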

But what metadata would you need when loading a directory? The only one I can think of is perhaps the MimeType, which is currently async in Nautilus.

I figured that metadata as a whole would only be retrieved with a "properties" dialog (as is the case in the current Nautilus), so there's no need to pull down all metadata for all files when reading in a directory.


What's required from a file manager's POV is a fast way to look up the
metadata available for a certain URI w/o much overhead (e.g. w/o any
RPCs). Perhaps an mmap()able file or an SQLite database. Or - for the
brave - store it in the extattrs of the file (though this is probably not
the way to go, as some file systems limit the size of the data stored
within the extended attributes of a file).
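
For the SQLite variant, an in-process lookup could look roughly like this. The table layout (uri, key, value) and the helper names open_store, put and lookup are assumptions made for illustration; no shared schema is defined in this thread.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a metadata store; the schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metadata ("
        " uri TEXT NOT NULL, key TEXT NOT NULL, value TEXT NOT NULL,"
        " PRIMARY KEY (uri, key))"
    )
    return conn


def put(conn, uri, key, value):
    """Insert or overwrite one metadata value for a URI."""
    conn.execute(
        "INSERT OR REPLACE INTO metadata (uri, key, value) VALUES (?, ?, ?)",
        (uri, key, value),
    )


def lookup(conn, uri):
    """Return all metadata for a URI as a dict, with no RPC involved."""
    rows = conn.execute(
        "SELECT key, value FROM metadata WHERE uri = ?", (uri,)
    )
    return dict(rows.fetchall())
```

The primary key on (uri, key) makes the per-URI lookup an index scan inside the file manager's own process, which is the low-overhead path being asked for.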

The problem is then how you take advantage of existing frameworks under development like Beagle, Kat and Tenor. They are the ones that are producing the metadata in the first place, and they each store it in their own databases (Beagle uses Lucene's DB, Kat SQLite3, Tenor Postgres, and my own implementation of a metadata framework will use the embedded MySQL lib), so it's kind of difficult to standardise it in-process. You also have a potentially huge dependency list, which no platform would accept in its core. Unfortunately I can't see any other way round this but IPC.


Benedikt




--
Mr Jamie McCracken
http://www.advogato.org/person/jamiemcc/
_______________________________________________
xdg mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/xdg
