"Aryeh Gregor" <[email protected]> wrote in message 
news:[email protected]...
> On Mon, Jan 18, 2010 at 7:47 AM, Henri Sivonen <[email protected]> wrote:
>
> 1) Output a very few pieces of metadata that would be useful to HTML
> consumers, like license metadata.  For these, we should use microdata
> or RDFa, maybe just with one or two vocabularies whitelisted, and it
> would be simplest to just let people type it into templates via
> wikitext.  I'm pretty certain about this.

Eh?  I get the feeling that we're reading from totally different song sheets 
here.  What you seem to be saying is that you expect the use case to be 
'license templates on steroids': on the image description page, we have 
license templates that now emit 
microdata/RDF/the-metadata-format-of-the-month, which can be picked up by 
whoever is interested.  That's not MediaWiki doing anything active with the 
data, and it's absolutely no different from marking up infoboxes.  In fact, 
the use case for infoboxes is arguably stronger, because their data structure 
is more complicated and harder to machine-read otherwise.

What I had assumed we meant by "MediaWiki doing stuff with metadata" was 
to pick up metadata about an image, and then output that **wherever the 
image is used**.  So when you view an article with an image, that use of the 
image has a metadata cloud that describes where the image is from, what its 
license is, whatever.  Information that, for an external image, might not be 
available via JavaScript or other means.  I see things like the 
"put-a-red-border-round-fair-use-images" script I have in my monobook being 
implemented just by picking out that metadata, and without having to run 
stacks of api queries.

That use case is very badly served by just allowing raw metadata in the 
image page wikitext; it's really no different to adding categories via a 
license template.  MediaWiki needs to have that metadata stored separately 
from wikitext, or at least entered via wikitext in a parser-friendly way: 
the customary way for the parser to pick 'stuff' out of wikitext is with 
parser functions, magic words, link syntax, whatever.
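
To make the "parser-friendly" route concrete, here is a minimal sketch in Python rather than MediaWiki's own PHP.  The {{#imagemeta:...}} parser function and the key names are invented for illustration; nothing like it is assumed to exist in MediaWiki today.  The point is only that the metadata comes out as plain key/value pairs, stored apart from the rendered text:

```python
import re

# Hypothetical parser-function syntax: {{#imagemeta:key=value|key=value}}
# (assumed for illustration; not an existing MediaWiki parser function).
IMAGEMETA_RE = re.compile(r"\{\{#imagemeta:(.*?)\}\}", re.DOTALL)


def extract_image_metadata(wikitext):
    """Return (metadata dict, wikitext with the parser-function calls removed)."""
    metadata = {}
    for match in IMAGEMETA_RE.finditer(wikitext):
        for pair in match.group(1).split("|"):
            key, sep, value = pair.partition("=")
            if sep:  # skip fragments that are not in key=value form
                metadata[key.strip()] = value.strip()
    stripped = IMAGEMETA_RE.sub("", wikitext).strip()
    return metadata, stripped


page = "{{#imagemeta:license=CC-BY-SA-3.0|author=Example User}}\nA photo of a bridge."
meta, text = extract_image_metadata(page)
```

Once the pairs are held separately like this, they can be emitted wherever the image is used, not just on the image description page.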

> We can always add new input formats or switch the output format later
> if we have good reason, though.  Especially if we keep input
> restricted to one or two vocabularies -- or three, which for microdata
> is all of them right now.  :)

Again, I don't know which side of the coin you're talking about: switching 
the output format is trivial *iff* there's a separation between the input and 
output.  If MW is extracting its metadata by reading [format] out of 
wikitext, then *adding* new formats becomes a PITA, and *removing* formats 
becomes impossible.  So much better to have a format-independent input 
system for extracting metadata, and then be able to implement any of a range 
of outputs as dictated by the times.
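
A rough sketch of what that separation could look like, again in Python for brevity.  The record fields, the renderer names and the example.org itemtype are assumptions made for this example, and the RDFa output presumes a "dc" prefix declared elsewhere on the page.  Each output format is one function over the same format-neutral record, so adding or dropping a format never touches the stored data:

```python
from html import escape

# Format-neutral record, as it might be stored outside wikitext.
# Field names are assumptions for this sketch.
record = {
    "license": "https://creativecommons.org/licenses/by-sa/3.0/",
    "author": "Example User",
}


def to_microdata(rec):
    # The itemtype URL is a placeholder, not a real vocabulary.
    return (
        '<span itemscope itemtype="http://example.org/work">'
        '<a itemprop="license" href="%s">license</a> '
        '<span itemprop="author">%s</span></span>'
        % (escape(rec["license"]), escape(rec["author"]))
    )


def to_rdfa(rec):
    # Assumes the "dc" prefix is declared elsewhere in the page.
    return (
        '<span><a rel="license" href="%s">license</a> '
        '<span property="dc:creator">%s</span></span>'
        % (escape(rec["license"]), escape(rec["author"]))
    )


# Adding or removing an output format is a one-line change here.
RENDERERS = {"microdata": to_microdata, "rdfa": to_rdfa}


def render(rec, fmt):
    return RENDERERS[fmt](rec)


microdata_html = render(record, "microdata")
rdfa_html = render(record, "rdfa")
```

Removing a format means deleting its entry from RENDERERS, and no wikitext anywhere has to change.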

--HM
 



_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l