2007/2/20, Jos van den Oever <[EMAIL PROTECTED]>:
2007/2/19, Mikkel Kamstrup Erlandsen <[EMAIL PROTECTED]>:
> Let's get the ball rolling on the metadata spec. This first period will just
> be *brainstorming*, so let's try and avoid the nitty gritty details for now.
>
> ** What we need:
>
> Fields) Metadata field names and descriptions for *desktop* objects
>
> Types) A type grouping of metadata fields to be used in user search
> language. Example types could be "Email", "Image", "Audio", etc.
>
> API) A dbus api to get/set metadata
>
> ?Tag/Emblem) Tagging/Keywords/Emblems
>
> ** Starting points/References:
> - Adobe XMP:
> http://partners.adobe.com/public/developer/en/xmp/sdk/XMPspecification.pdf
>
> - Shared Metadata Spec:
> http://freedesktop.org/wiki/Standards_2fshared_2dfilemetadata_2dspec
> - Tracker metadata api:
> http://svn.gnome.org/viewcvs/tracker/trunk/data/tracker-introspect.xml?view=markup
>
> - Spotlight Metadata Spec:
> http://developer.apple.com/documentation/Carbon/Reference/MetadataAttributesRef/Reference/CommonAttrs.html
>
> - Shared Emblem Spec:
> http://freedesktop.org/wiki/Standards_2fdesktop_2demblem_2dspec
> - Other ideas? Nepomuk-specs? Beagle-specs?
>
> ** My thoughts:
> Regarding Fields): To prevent death-by-1000-page-spec I suggest we keep the
> field names to a core set of commonly used attributes. I.e. not like Apple's
> Spotlight spec (see above) which defines every known property in the
> universe. When things move on, teams with expert knowledge can refine
> extensions to this spec. E.g. a Wasabi Photography Metadata spec could be
> hashed out by people in the know (which could just be EXIF, but I'm not the
> photography expert).
>
> Regarding Types): There are some suggestions in the top of the Tracker api
> link above. Regarding these I think we should leave the VFS* types out, and
> only use single-word type names (i.e. no spaces).
>
> On the API): Obviously we need getters and setters. They probably need to
> operate on uris. There probably needs to be some search functionality in
> here too since we probably shouldn't assume that the indexer and metadata
> server are the same.
>
> Tagging/Emblems: If you ask me they should be "just another type of
> metadata". When the metadata spec matures a bit we can evaluate if it needs
> its own api to make things easier (and allow for dedicated tagging
> services).

Hi All,

First I'd like to point to the original mail I sent on this subject. It
already contained a relatively simple spec framework. That is, not attribute
names, but a way to define them, type them and check them. There was also some
code attached to allow testsets to check the correctness of metadata
extraction from files. Hence the title of the mail: 'mimetype standardization
by testsets'. I still stand by this idea.
Sorry Jos, how could I miss this out. For reference - here's the original
thread:
http://lists.freedesktop.org/archives/xdg/2006-October/008682.html

Here is an idea for a simple proposal.
- Each metadata type is identified by a URI. E.g.
  http://www.freedesktop.org/metadata/xhtml1/title.
- For each URI there will be human readable descriptions in every language and
  keywords in every language. I will use the keyword in the further
  description mixed with the URI.
I like this idea as such. I can't readily see how it intermixes with known
widespread standards such as DC though..?

- It also has a simple type: integer, string, float, binary. Or the more
  elaborate list of the tracker spec or the xml schema simple types.
  Personally, I prefer the xml schema spec [1]. We'd need to support only a
  subset.
- Each type has a maximal cardinality. This means how often a field may occur
  per file/object. For example the metadata 'size' should occur only once, but
  the metadata 'tag' may occur multiple times.
- Each may have one parent type. Cardinality and type of the parent are
  inherited, but may be restricted. Having multiple parents is a bad idea I
  think.
- Each type is embedded, not embedded, or unspecified.
- Each type is derived or not derived. E.g. 'size' is derived, but 'title' is
  not. This means that 'title' is potentially writeable. Whether a metadata
  field is writeable depends on the implementation. Using 'embedded' and
  'derived' instead of 'writeable' is clearer, because 'writeable' depends on
  a number of factors: can the software write the property, is the file
  writeable, can the database handle external metadata.

Groups are defined separately from the types. They are simple lists of
metadata type URIs. All children of these URIs also fall into this group.
Groups are also identified by a URI and they have translations in different
languages for the user interfaces. They may also have a short keyword form.
The metadata types in a group do not need to have the same cardinality or data
type.

What do you think?
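To make the proposal above easier to discuss, here is a minimal sketch in Python of how the type and group definitions might be modelled. Everything in it is an assumption for illustration, not part of any agreed spec: the class names `MetadataType` and `Group`, the `check` helper, and all URIs except the xhtml1/title one quoted above are hypothetical. It also only lets a child override the parent's type and cardinality; it does not verify that the override is really a restriction.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MetadataType:
    """One metadata type, identified by a URI (hypothetical model)."""
    uri: str
    datatype: Optional[type] = None        # None: inherit from parent
    max_cardinality: Optional[int] = None  # None: inherit; unbounded if no ancestor sets it
    parent: Optional["MetadataType"] = None  # at most one parent
    embedded: Optional[bool] = None        # True, False, or None for "unspecified"
    derived: bool = False                  # e.g. 'size' is derived, 'title' is not

    def _ancestry(self):
        # Yield this type, then its parent, grandparent, and so on.
        t = self
        while t is not None:
            yield t
            t = t.parent

    def resolved_datatype(self) -> type:
        for t in self._ancestry():
            if t.datatype is not None:
                return t.datatype
        return str  # assumed default when nothing in the chain sets a type

    def resolved_max_cardinality(self) -> Optional[int]:
        for t in self._ancestry():
            if t.max_cardinality is not None:
                return t.max_cardinality
        return None  # unbounded

    def is_a(self, other: "MetadataType") -> bool:
        # True if this type is `other` or one of its descendants.
        return any(t.uri == other.uri for t in self._ancestry())


@dataclass(frozen=True)
class Group:
    """A group is a list of type URIs; children of members belong too."""
    uri: str
    members: Tuple[MetadataType, ...]

    def contains(self, t: MetadataType) -> bool:
        return any(t.is_a(m) for m in self.members)


def check(t: MetadataType, values: list) -> bool:
    """Check extracted values against the type's cardinality and data type."""
    limit = t.resolved_max_cardinality()
    if limit is not None and len(values) > limit:
        return False
    return all(isinstance(v, t.resolved_datatype()) for v in values)


BASE = "http://www.freedesktop.org/metadata/"  # namespace taken from the example URI
size = MetadataType(BASE + "size", int, 1, derived=True)
tag = MetadataType(BASE + "tag", str)          # may occur multiple times
title = MetadataType(BASE + "title", str, 1)
xhtml_title = MetadataType(BASE + "xhtml1/title", parent=title, embedded=True)
text_group = Group(BASE + "group/text", (title, tag))
```

With these definitions, `check(size, [1, 2])` fails because 'size' may occur only once, while `text_group.contains(xhtml_title)` holds because the xhtml1 title is a child of a group member.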
This was somewhat close to the things I have been thinking about. I have to
give this a bit more thought when I get home...

Cheers,
Mikkel
_______________________________________________
xdg mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/xdg
