Re: [Wasabi] Kicking of the Metadata spec - brainstorm

Mikkel Kamstrup Erlandsen Tue, 20 Feb 2007 06:02:19 -0800

2007/2/20, Jos van den Oever <[EMAIL PROTECTED]>:


2007/2/19, Mikkel Kamstrup Erlandsen <[EMAIL PROTECTED]>:
> Let's get the ball rolling on the metadata spec. This first period will just
> be *brainstorming*, so let's try and avoid the nitty gritty details for now.
>
>  ** What we need:
>
>   Fields)  Metadata field names and descriptions for *desktop* objects
>
>   Types) A type grouping of metadata fields to be used in user search
> language. Example types could be "Email", "Image", "Audio", etc.
>
>   API) A dbus api to get/set metadata
>
>   ?Tag/Emblem) Tagging/Keywords/Emblems
>
>  ** Starting points/References:
>  - Adobe XMP:
> http://partners.adobe.com/public/developer/en/xmp/sdk/XMPspecification.pdf
>  - Shared Metadata Spec:
> http://freedesktop.org/wiki/Standards_2fshared_2dfilemetadata_2dspec
>  - Tracker metadata api:
> http://svn.gnome.org/viewcvs/tracker/trunk/data/tracker-introspect.xml?view=markup
>  - Spotlight Metadata Spec:
> http://developer.apple.com/documentation/Carbon/Reference/MetadataAttributesRef/Reference/CommonAttrs.html
>  - Shared Emblem Spec:
> http://freedesktop.org/wiki/Standards_2fdesktop_2demblem_2dspec
>  - Other ideas? Nepomuk-specs? Beagle-specs?
>
>  ** My thoughts:
> Regarding Fields): To prevent death-by-1000-page-spec I suggest we keep the
> field names to a core set of commonly used attributes, i.e. not like Apple's
> Spotlight spec (see above), which defines every known property in the
> universe. When things move on, teams with expert knowledge can refine
> extensions to this spec. For example, a Wasabi Photography Metadata spec
> could be hashed out by people in the know (which could just be EXIF, but I'm
> not the photography expert).
>
> Regarding Types): There are some suggestions at the top of the Tracker api
> link above. Regarding these I think we should leave the VFS* types out, and
> only use single-word type names (i.e. no spaces).
>
> On the API): Obviously we need getters and setters. They probably need to
> operate on uris. There probably needs to be some search functionality in
> here too, since we probably shouldn't assume that the indexer and metadata
> server are the same.
>
> Tagging/Emblems: If you ask me they should be "just another type of
> metadata". When the metadata spec matures a bit we can evaluate whether it
> needs its own api to make things easier (and allow for dedicated tagging
> services).

Hi All,

First I'd like to point to the original mail I sent on this subject.
It already contained a relatively simple spec framework. That is, not
attribute names, but a way to define them, type them and check them.
There was also some code attached to allow testsets to check the
correctness of metadata extraction from files. Hence the title of the
mail: 'mimetype standardization by testsets'. I still stand by this
idea.



Sorry Jos, how could I miss this? For reference, here's the original
thread:
http://lists.freedesktop.org/archives/xdg/2006-October/008682.html


Here is an idea for a simple proposal.


- Each metadata type is identified by a URI. E.g.
http://www.freedesktop.org/metadata/xhtml1/title.
- For each URI there will be human readable descriptions in every
language and keywords in every language. In the rest of this
description I will use the keyword and the URI interchangeably.



I like this idea as such. I can't readily see how it fits in with known
widespread standards such as Dublin Core (DC), though.


- It also has a simple type: integer, string, float, binary. Or the
more elaborate list of the tracker spec or the xml schema simple
types. Personally, I prefer the xml schema spec [1]. We'd need to
support only a subset.
- Each type has a maximal cardinality. This means how often a field
may occur per file/object. For example the metadata 'size' should
occur only once, but the metadata 'tag' may occur multiple times.
- Each type may have one parent type. Cardinality and type of the parent
are inherited, but may be restricted. Having multiple parents is a bad
idea I think.
- Each type is embedded, not embedded, or unspecified.
- Each type is derived or not derived. E.g. 'size' is derived, but
'title' is not. This means that 'title' is potentially writeable.

Whether a metadata field is writeable depends on the implementation.
Using 'embedded' and 'derived' instead of 'writeable' is clearer,
because 'writeable' depends on a number of factors: can the software
write the property, is the file writeable, can the database handle
external metadata.
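
As a rough illustration of the list above, here is a minimal Python sketch of how such a type definition could be modelled. It is not part of the original proposal: the class name MetadataType, its field and method names, and every URI except http://www.freedesktop.org/metadata/xhtml1/title are hypothetical. It assumes that "restricted" means a child may lower, but never raise, the maximal cardinality of its parent, and it treats "not derived" as "potentially writeable".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetadataType:
    """Hypothetical model of one metadata type from the proposal."""
    uri: str
    value_type: Optional[str] = None       # None: inherit from the parent
    max_cardinality: Optional[int] = None  # None: inherit, or unbounded at the root
    parent: Optional["MetadataType"] = None  # at most one parent
    embedded: Optional[bool] = None        # None stands for "unspecified"
    derived: bool = False

    def effective_value_type(self) -> Optional[str]:
        # The simple type is inherited from the parent unless set here.
        if self.value_type is not None:
            return self.value_type
        return self.parent.effective_value_type() if self.parent else None

    def effective_max_cardinality(self) -> Optional[int]:
        # Cardinality is inherited; a child may restrict it but not widen it.
        inherited = self.parent.effective_max_cardinality() if self.parent else None
        if self.max_cardinality is None:
            return inherited
        if inherited is not None and self.max_cardinality > inherited:
            raise ValueError(f"{self.uri} widens the cardinality of its parent")
        return self.max_cardinality

    def potentially_writeable(self) -> bool:
        # Derived values (such as 'size') are computed, so never writeable.
        return not self.derived

    def accepts(self, values: list) -> bool:
        # True if the number of values respects the maximal cardinality.
        limit = self.effective_max_cardinality()
        return limit is None or len(values) <= limit


BASE = "http://www.freedesktop.org/metadata/"
# All URIs below except xhtml1/title are made up for this example.
title = MetadataType(BASE + "example/title", value_type="string", max_cardinality=1)
xhtml_title = MetadataType(BASE + "xhtml1/title", parent=title)
tag = MetadataType(BASE + "example/tag", value_type="string")
size = MetadataType(BASE + "example/size", value_type="integer",
                    max_cardinality=1, derived=True)
```

In this sketch xhtml_title carries no type or cardinality of its own and picks both up from title, while tag has no upper bound. Whether a field is actually writeable would still depend on the implementation, as described above.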

Groups are defined separately from the types. They are simple lists of
metadata type URIs. All children of these URIs also fall into this
group. Groups are also identified by a URI and they have translations
in different languages for the user interfaces. They may also have a
short keyword form. The metadata types in a group do not need to have
the same cardinality or data type.
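
The group rule can be sketched the same way. The snippet below is only an illustration under assumptions: the function name in_group, the parent_of mapping, and all URIs except http://www.freedesktop.org/metadata/xhtml1/title are hypothetical. It represents a group as a plain set of type URIs and walks up the single-parent chain to decide membership.

```python
from typing import Dict, Set


def in_group(type_uri: str, group_members: Set[str],
             parent_of: Dict[str, str]) -> bool:
    """Return True if type_uri, or any of its ancestors, is listed in the group."""
    seen = set()
    current = type_uri
    # Walk up the parent chain; 'seen' guards against accidental cycles.
    while current is not None and current not in seen:
        if current in group_members:
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False


BASE = "http://www.freedesktop.org/metadata/"
# Hypothetical hierarchy: xhtml1/title is a child of a generic title type.
parent_of = {BASE + "xhtml1/title": BASE + "example/title"}
# Hypothetical group listing only the generic title and tag types.
document_group = {BASE + "example/title", BASE + "example/tag"}

title_is_member = in_group(BASE + "xhtml1/title", document_group, parent_of)
size_is_member = in_group(BASE + "example/size", document_group, parent_of)
```

Because each type has at most one parent, the membership check is a simple walk up a chain rather than a graph search, which is one practical argument for avoiding multiple parents.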

What do you think?




This was somewhat close to the things I have been thinking about. I have to
give this a bit more thought when I get home...

Cheers,
Mikkel

_______________________________________________
xdg mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/xdg
