Henri Sivonen wrote:
> Going off-topic for public-html. -public-html +www-archive
I don't think it is off-topic, but anyway...
> > No, it doesn't even scale for them. For instance, in the HTML I
> > produce I could use specific conventions (classnames, link relations,
> > whatever) to embed metadata. A generic transformer for HTML wouldn't
> > be able to handle that.
> If you use conventions specific to your site, you are venturing outside
> the well-known part. If you serve a program that transforms your
> specific syntax to RDF, you move the point where a well-known vocabulary
> is needed to the RDF layer, but concrete common ground with the
> information consumer has to come somewhere. However, making the consumer
> run a foreign XSLT program has the scalability problem of crawlers being
> able to execute programs in large quantities. With Validator.nu, the
> main scalability problem seems to be the ability to execute Schematron,
> which is implemented by compiling the Schematron schema into XSLT and
> running the XSLT program.

I'm not sure what you're talking about. The GRDDL transform might
extract information from custom markup patterns, but produce statements
using publicly known URIs.
We discussed this already over on public-html. There are many ways to
publish RDF, some are:
- publish RDF in XML or N3...
- publish HTML with embedded RDFa
- publish HTML with microformats
- publish HTML with GRDDL
- do the latter, but run the transform on the server, essentially
publishing both HTML and RDF
...and so on.
No recipient is *forced* to run the GRDDL transformation.
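To make the GRDDL variant concrete: the hook in the published HTML is just a profile attribute plus a transformation link in the document head. The profile URI below is the one GRDDL defines; the transform URI and document content are made up for illustration. A minimal sketch:

```xml
<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Example</title>
    <!-- hypothetical site-specific transform; a consumer that opts in
         fetches it and runs it to obtain RDF, everyone else simply
         reads the HTML -->
    <link rel="transformation"
          href="http://example.org/transforms/site2rdf.xsl"/>
  </head>
  <body>...</body>
</html>
```

Note that the HTML stays self-contained: a recipient that ignores the link loses nothing but the RDF view.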
> Moreover, for class-based syntaxes, a transformer that contains its
> executable parts (whether in XSLT or in another programming language)
> only needs to cover the kind of syntax that a given application is
> interested in consuming. If I'm looking for hCard data and my
> application understands RDF vCard, I only need a transformation from
> hCard to RDF vCard. I don't need a solution that scales to all
> microformats.
Yes. So? As far as I can tell, nobody claimed that GRDDL necessarily is
the right solution to *every single* use case one can think of. If you
only look for address book data, you can still consume generic RDF and
just extract the properties you're interested in.
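For illustration, a sketch of what such a targeted hCard-to-vCard extraction deals with — the name and URL are made up, and the vCard-in-RDF namespace shown is one of several candidates in circulation:

```xml
<!-- hCard instance: plain class-based markup -->
<div class="vcard">
  <span class="fn">J. Random Hacker</span>
  <a class="url" href="http://example.org/~jrh/">home</a>
</div>
<!-- an hCard-to-RDF-vCard transform could emit something like (Turtle):
     @prefix v: <http://www.w3.org/2006/vcard/ns#> .
     [] a v:VCard ;
        v:fn "J. Random Hacker" ;
        v:url <http://example.org/~jrh/> .
-->
```

A consumer interested only in address book data can ignore every other class-based convention in the page.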
> If you are serving a document in your vocabulary and a program that
> makes sense of it, are you really communicating with others by
> sending semantic markup or are you communicating by sending programs?
> If you made your markup empty and embedded all the data in the
> transformation program, would the recipient know any difference?
I don't see how that is relevant. What's relevant is what the
recipient gets. And of course the intent of GRDDL is to have a single
transform for a vocabulary, and to reuse that transform for each
instance document. You could use it in a different way, but who cares?
> http://lists.w3.org/Archives/Public/www-tag/2008Jul/0164.html
That doesn't seem to be a reply to the thing we were discussing; it's
just re-stating that forcing clients to run code to read information is bad.
Agreed.
The situation the quote on the TAG list is about is Flash, which is not
crawlable *at all* without running custom code (at least that's the
concern). This is not the case for HTML+GRDDL; the HTML code is
available as it should be.
If what you want is a way to embed generic RDF without having to serve
multiple documents or make the client run code, RDFa seems to be one
potential solution.
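For comparison, the same kind of data expressed inline with RDFa — attribute names per the XHTML+RDFa syntax, with the vocabulary prefix again being an assumption:

```xml
<div xmlns:v="http://www.w3.org/2006/vcard/ns#" typeof="v:VCard">
  <span property="v:fn">J. Random Hacker</span>
  <a rel="v:url" href="http://example.org/~jrh/">home</a>
</div>
```

Here the triples are recoverable from the markup alone, with no transformation program in the loop.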
BR, Julian