RDF was supposed to address these things. I think it's the right approach, because you need a general model for metadata, multiple ways to express it (in XML, in HTTP headers, etc.), and a way to combine metadata from multiple sources to come up with a definitive view of the world. Unfortunately, it has moved on to grander schemes before solving these simple problems.
Mark, I agree with most of your statements, but I might suggest that another way to look at the problem is as a URI resolution problem.
Rather than trying to describe a "mirror" for a link, I think a more consistent approach is to provide pointers to a URI resolver for that link.
In the Content-Addressable Web, we use the THTTP resolvers specified in RFC 2169 to do things like the following (a small code sketch follows the list):
I2N - From a given URI, discover a URN. So we take http://foo.com/bigfile.tgz and discover urn:sha1:ASDLFKJWEROI23U4OKASJDFLAKSJFD
I2Ls - From a given URI, discover alternate URLs. So we take the above 'urn:sha1:ASDLFKJWEROI23U4OKASJDFLAKSJFD' and discover a list of URLs that mirror the data specified by that URN.
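For concreteness, here's a minimal Python sketch of that convention (RFC 2169's "GET /uri-res/<service>?<uri>" form). The resolver host is made up, and a real client would of course want error handling and content-type checks:

from urllib.request import urlopen
from urllib.parse import quote

RESOLVER = "http://resolver.example.com"  # hypothetical resolver host

def thttp(service, uri):
    """Issue a THTTP request (GET /uri-res/<service>?<uri>), return the body."""
    url = "%s/uri-res/%s?%s" % (RESOLVER, service, quote(uri, safe=":/?"))
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")

# I2N: from a given URL, discover the content-derived URN.
urn = thttp("I2N", "http://foo.com/bigfile.tgz").strip()

# I2Ls: from that URN, discover alternate URLs.  The response is
# text/uri-list: one URI per line, lines starting with '#' are comments.
mirrors = [line for line in thttp("I2Ls", urn).splitlines()
           if line and not line.startswith("#")]
print(mirrors)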
Thus, I think it would be very useful to use RDF to describe URI resolution services, as well as to serialize equivalence relationships between URIs.
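As a rough illustration of what that might look like, here's a sketch using the Python rdflib library. The caw: vocabulary and its terms are invented purely for this example, not a proposed schema:

from rdflib import Graph, Namespace, URIRef

CAW = Namespace("http://example.org/caw-schema#")  # hypothetical vocabulary

g = Graph()
g.bind("caw", CAW)

url = URIRef("http://foo.com/bigfile.tgz")
urn = URIRef("urn:sha1:ASDLFKJWEROI23U4OKASJDFLAKSJFD")

# Assert that the URL and the URN identify the same bytes...
g.add((url, CAW.equivalentTo, urn))
# ...and point at a THTTP resolver that can answer queries about it.
g.add((urn, CAW.resolvedBy, URIRef("http://resolver.example.com/uri-res/")))

print(g.serialize(format="xml"))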
BTW, we ran into very similar metadata problems at Akamai; one outcome of that was URISpace (http://www.w3.org/TR/urispace.html). It's probably too heavyweight for your particular project, but you might find it interesting. I've been thinking of expressing entire Web site configurations (Apache .conf files, P3P.xml, robots info, etc.) in URISpace, and then writing a transform to create the appropriate configuration files from that one source. Anybody interested?
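Not URISpace itself (its actual syntax is in the Note linked above), but here's a rough Python sketch of the transform step, assuming a made-up unified site description as the single source:

# The unified site description below is a stand-in for a real
# URISpace document; the keys are invented for illustration.
site = {
    "docroot": "/var/www/htdocs",
    "disallow": ["/private/", "/tmp/"],  # paths robots should skip
}

def to_apache_conf(site):
    """Emit an Apache-style DocumentRoot directive from the description."""
    return 'DocumentRoot "%s"\n' % site["docroot"]

def to_robots_txt(site):
    """Emit a robots.txt that disallows the configured paths."""
    lines = ["User-agent: *"]
    lines += ["Disallow: %s" % p for p in site["disallow"]]
    return "\n".join(lines) + "\n"

open("httpd.conf", "w").write(to_apache_conf(site))
open("robots.txt", "w").write(to_robots_txt(site))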
Very nice. This solves a problem I'd been beating my head against: how to describe a given resolver service as pertaining to a set of URIs.
Could you briefly describe the differences/similarities between this and XPath?
Thanks,
-- Justin Chapweske, Onion Networks http://onionnetworks.com/
