Everything you say could be true, but I'm not sure what the point is.
You get what you pay for.
The problem is, someone in the year 2029 encounters a use of some
identifier in some document or database, and wants to know what it
refers to. It's a so-called "persistent identifier" to the degree that
this is likely to succeed. (Identifiers aren't in themselves
persistent or not; it's the possibility of dereferencing them that would
be. And application of the label "persistent" is always mere wishful
thinking; there's no test for it.)
Persistence for some number of years can be arranged via an SLA or
endowment, and replication is a big help, but as time goes on these
machinations all lose their oomph.
The web is not the only way to figure out what someone meant by a
term; any index or table or database that contains the correct
information will do. So the issue has little to do with anything
outliving the web. It is merely about whether the (meta)data someone
needs will exist in a place they can get to, when they need it. The
problem existed pre-web and was solved through a replicated
infrastructure (library card catalogs and holdings). If a library
burned down, you could usually find what you wanted at another library.
(If you think the use of http: syntax for identifiers puts them at a
disadvantage relative to urn:, I'm not sure why this should be the
case - the syntax shouldn't matter. (At least not for RDF, which as I
said should declare independence from the HTTP protocol, while
maintaining a sort of opportunistic and nonbinding allegiance.) In
any case the choice of URI scheme is a minor problem relative to that
of future accessibility.)
I don't know how to assess your claim; it may be true or not. But it
seems obvious that someone who wants assertions (whether their own or
someone else's, it doesn't matter) to be understood at time t knows
that the terms used in the assertions have to be understandable at
time t. If they know what they're doing they'll take pains to make
sure that for each term used either (a) the term belongs to a
vocabulary that seems quite likely to be alive at time t, or else (b)
information designed to promote understandability is included in the
context of the assertion (i.e. in the same file) so that it will be
carried along with the assertion as it goes through life (akin to
propagating the full citation along with a DOI, even though in
principle the DOI by itself is sufficient). Such information could be
a "definition" or defining properties, location hints (locations of
copies), and/or other stuff.
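
As a rough illustration of options (a) and (b), here is a minimal
Python sketch. The names (Term, expected_end_year,
terms_needing_context), the example.org URIs, and the years are all
hypothetical, and the expected lifetimes are guesses that someone
would have to supply by hand; there is no test for persistence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Term:
    uri: str
    # A guess at the last year the term's vocabulary will still be usable.
    expected_end_year: int
    # Definition, citation, or location hints carried in the same file, if any.
    inline_context: Optional[str] = None


def terms_needing_context(terms, target_year):
    """Return the terms that satisfy neither (a) nor (b) for target_year."""
    at_risk = []
    for term in terms:
        likely_alive = term.expected_end_year >= target_year  # option (a)
        carried_along = term.inline_context is not None       # option (b)
        if not (likely_alive or carried_along):
            at_risk.append(term)
    return at_risk


# Hypothetical terms used in one assertion meant to be readable in 2029.
terms = [
    Term("http://example.org/vocab#a", expected_end_year=2040),
    Term("http://example.org/vocab#b", expected_end_year=2015,
         inline_context="definition text copied into the same file"),
    Term("http://example.org/vocab#c", expected_end_year=2015),
]

for term in terms_needing_context(terms, target_year=2029):
    print(term.uri)
```

Only the third term is reported: it is neither expected to last until
the target year nor accompanied by anything that would help a later
reader work out what it meant.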
I try to stay away from the "semantic web" movement because it seems
to not care about this problem - the implicit assumption is that all
assertions are ephemeral. Coming up with credible URIs was the first
problem I hit when I started doing RDF, and after three years I'm only
now making a little headway on it.
Coincidentally, today I had a couple of conversations about the need
for open replicable metadata, as a way to make identifier systems more
credible, trusted, and likely to persist. (I'm at the International
Repositories Workshop in Amsterdam.)
By "credible commitments" I meant things like the cool-URIs site
policy for w3.org. Because of this, and a bet that w3.org will
outlive neurocommons.org, I prefer URIs beginning http://w3.org/ to
those beginning http://neurocommons.org/ (other things being equal).
And I figure that by the time ICANN goes sour or w3.org folds, there
will be alternative resolution methods, of the sort that is encouraged
by URNs (and maybe handles?) and ought to be encouraged for http: as
well.
Jonathan
On Mar 16, 2009, at 2:45 AM, Larry Masinter wrote:
I'm still stuck on the lifetimes of URIs vs. lifetimes
of statements, in engineering the semantic web:
"... you might be able to
make some plausible predictions or credible commitments.."
Stuff goes away. Mean time between site failure might be less
than 10 years. Companies change their names, merge, split,
go out of business, stop doing the business that caused them
to bring up the web site. Students graduate. Non-profit
organizations change brands. Web technology itself is
only 20 years old. 20 years from now, sure, maybe some will
still be around, but on average, no one has the
foundation or insurance policy to guarantee that a
URI will still be around to respond "200-" to anything
for the expected lifetime of the assertion being made.
Many industries and applications have a requirement that
the statements made and inferences about them need to last
much longer than 20 years: government documents, descriptions
of building plans, life insurance policies.
Anyone who wants to make a "semantic web" statement which
needs to have meaning beyond the guaranteed lifetime of the
web sites used to form their "ontology" cannot link the
meaning of those statements to the future 200-response
expectation of the referenced web site. The expected
lifetime of any particular piece of web content is much
less than the needed lifetime of the validity of semantics
and understanding of semantic intent.
I think it is more natural to assume that there are
*no* stable URIs in the long run: every URI has a
lifetime, we wish every one to have as long a life
as possible, but every single URI will, at some point
in the future, evaporate. Consider:
at any instant, there are:

* People who want to make semantic web assertions P
* assertions that those people want to make
      A(p) for p in P
* for each assertion, their desired lifetime
  (how long each person wants to make sure the
  assertion is interpretable)
      D(a) for a in A(p) for p in P
* terms needed in those assertions
      T(a) for a in A(p) for p in P
* URIs under the control of those people
  which are appropriate
      U(t) for t in T(a) for a in A(p) for p in P
* expected lifetime of those URIs
      E(u) for u in U(t) for t in T(a) for
      a in A(p) for p in P.

CLAIM:
Most people don't have the ability to make
assertions for which the URIs they use have
an expected lifetime longer than the desired
lifetime of all of the assertions they want
to make.
for a large percentage of p in P
there are some assertions a in A(p)
such that, for some needed term
t in T(a), the desired
lifetime of the assertion D(a) exceeds
the maximum expected lifetime of
all resources available to p.
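
One way to read the claim is as a check over toy data, sketched
below in Python. Everything in it is an illustrative assumption: the
people, the lifetimes in years, the function names, and the choice
to take the maximum E(u) separately for each term.

```python
# Toy data: every name and number here is made up for illustration.
# people[p] lists the assertions A(p). Each assertion records its desired
# lifetime D(a) in years and, for each term t in T(a), the expected
# lifetimes E(u) of the URIs U(t) that person could use for that term.
people = {
    "alice": [
        {"desired": 50, "terms": {"owner": [10, 30], "parcel": [60]}},
        {"desired": 5, "terms": {"title": [10]}},
    ],
    "bob": [
        {"desired": 3, "terms": {"topic": [8, 2]}},
    ],
}


def assertion_outlives_uris(assertion):
    """True if some needed term has no URI expected to last as long as D(a)."""
    return any(
        max(lifetimes, default=0) < assertion["desired"]
        for lifetimes in assertion["terms"].values()
    )


def fraction_affected(people):
    """Fraction of people with at least one assertion that outlives its URIs."""
    affected = [
        person for person, assertions in people.items()
        if any(assertion_outlives_uris(a) for a in assertions)
    ]
    return len(affected) / len(people)


print(fraction_affected(people))
```

With these made-up numbers, one of the two people has an assertion
whose desired lifetime exceeds the best available URI for one of its
terms. Whether the real fraction is large is exactly what the claim
asserts and what this sketch cannot show.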
Larry
--
http://larry.masinter.net