Re: stuff goes away

Jonathan Rees Mon, 16 Mar 2009 15:54:56 -0700

Everything you say could be true, but I'm not sure what the point is. You get what you pay for.

The problem is, someone in the year 2029 encounters a use of some identifier in some document or database, and wants to know what it refers to. It's a so-called "persistent identifier" to the degree that this is likely to succeed. (Identifiers aren't in themselves persistent or not; it's the possibility of dereferencing them that would be. And application of the label "persistent" is always mere wishful thinking; there's no test for it.)

Persistence for some number of years can be arranged via an SLA or endowment, and replication is a big help, but as time goes on these machinations all lose their oomph.

The web is not the only way to figure out what someone meant by a term; any index or table or database that contains the correct information will do. So the issue has little to do with anything outliving the web. It is merely about whether the (meta)data someone needs will exist in a place they can get to, when they need it. The problem existed pre-web and was solved through a replicated infrastructure (library card catalogs and holdings). If a library burned down, you could usually find what you wanted at another library.
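
To make the replication point concrete, here is a minimal sketch of a lookup that consults several independent copies of a catalog in turn. The function name `lookup` and the representation of each replica as a plain dictionary are assumptions made for illustration; nothing in this thread prescribes them.

```python
from typing import Dict, Iterable, Optional


def lookup(identifier: str, replicas: Iterable[Dict[str, str]]) -> Optional[str]:
    """Return the first description found for `identifier`, or None.

    Each replica is modeled as a dict from identifier to description
    (an assumption for this sketch; a real replica could be any index,
    table, or database holding the same information).
    """
    for catalog in replicas:
        record = catalog.get(identifier)
        if record is not None:
            return record
    # No surviving replica knows the identifier.
    return None
```

The identifier stays interpretable as long as at least one replica that holds it remains reachable, whichever library happens to have burned down.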

(If you think the use of http: syntax for identifiers puts them at a disadvantage relative to urn:, I'm not sure why this should be the case - the syntax shouldn't matter. (At least not for RDF, which as I said should declare independence from the HTTP protocol, while maintaining a sort of opportunistic and nonbinding allegiance.) In any case the choice of URI scheme is a minor problem relative to that of future accessibility.)

I don't know how to assess your claim; it may be true or not. But it seems obvious that someone who wants assertions (whether their own or someone else's, it doesn't matter) to be understood at time t knows that the terms used in the assertions have to be understandable at time t. If they know what they're doing they'll take pains to make sure that for each term used either (a) the term belongs to a vocabulary that seems quite likely to be alive at time t, or else (b) information designed to promote understandability is included in the context of the assertion (i.e. in the same file) so that it will be carried along with the assertion as it goes through life (akin to propagating the full citation along with a DOI, even though in principle the DOI by itself is sufficient). Such information could be a "definition" or defining properties, location hints (locations of copies), and/or other stuff.
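
As a rough illustration of conditions (a) and (b), the sketch below flags the terms of an assertion that satisfy neither. All names here (`TermContext`, `BundledAssertion`, `unprotected_terms`, and the idea of a list of trusted vocabulary prefixes) are hypothetical and chosen only for this example; judging which vocabularies are "quite likely to be alive at time t" is left to the caller.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class TermContext:
    """Information carried in the same file as the assertion (option b)."""
    definition: str = ""
    location_hints: List[str] = field(default_factory=list)


@dataclass
class BundledAssertion:
    """An assertion's terms plus whatever context travels with it."""
    terms: List[str]
    context: Dict[str, TermContext] = field(default_factory=dict)


def unprotected_terms(assertion: BundledAssertion,
                      trusted_prefixes: Sequence[str]) -> List[str]:
    """Return the terms that meet neither condition (a) nor condition (b)."""
    risky = []
    for term in assertion.terms:
        # (a) the term belongs to a vocabulary the author bets will survive.
        in_trusted = any(term.startswith(p) for p in trusted_prefixes)
        # (b) a definition or location hint accompanies the assertion.
        ctx = assertion.context.get(term)
        has_context = ctx is not None and bool(ctx.definition or ctx.location_hints)
        if not (in_trusted or has_context):
            risky.append(term)
    return risky
```

An empty result means every term is covered by one condition or the other; it says nothing about whether the author's bets on the trusted vocabularies will pay off.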

I try to stay away from the "semantic web" movement because it seems to not care about this problem - the implicit assumption is that all assertions are ephemeral. Coming up with credible URIs was the first problem I hit when I started doing RDF, and after three years I'm only now making a little headway on it.

Coincidentally, today I had a couple of conversations about the need for open replicable metadata, as a way to make identifier systems more credible, trusted, and likely to persist. (I'm at the International Repositories Workshop in Amsterdam.)

By "credible commitments" I meant things like the cool-URIs site policy for w3.org. Because of this, and a bet that w3.org will outlive neurocommons.org, I prefer URIs beginning http://w3.org/ to those beginning http://neurocommons.org/ (other things being equal). And I figure that by the time ICANN goes sour or w3.org folds, there will be alternative resolution methods, of the sort that is encouraged by URNs (and maybe handles?) and ought to be encouraged for http: as well.


Jonathan

On Mar 16, 2009, at 2:45 AM, Larry Masinter wrote:

I'm still stuck on the lifetimes of URIs vs. lifetimes
of statements, in engineering the semantic web:

"... you might be able to
make some plausible predictions or credible commitments.."

Stuff goes away. Mean time between site failure might be less
than 10 years. Companies change their names, merge, split,
go out of business, stop doing the business that caused them
to bring up the web site. Students graduate. Non-profit
organizations change brands. Web technology itself is
only 20 years old. 20 years from now, sure, maybe some will
still be around, but on the average, no one has the
foundation or insurance policy to guarantee that a
URI will still be around to respond "200-" to anything
for the expected lifetime of the assertion being made.

Many industries and applications have a requirement that
the statements made and inferences about them need to last
much longer than 20 years: government documents, descriptions
of building plans, life insurance policies.

Anyone who wants to make a "semantic web" statement which
needs to have meaning beyond the guaranteed lifetime of the
web sites used to form their "ontology" cannot link the
meaning of those statements to the future 200-response
expectation of the referenced web site. The expected
lifetime of any particular piece of web content is much
less than the needed lifetime of the validity of semantics
and understanding of semantic intent.

I think it is more natural to assume that there are
*no* stable URIs in the long run: every URI has a
lifetime, we wish every one to have as long a life
as possible, but every single URI will, at some point
in the future, evaporate. Consider:

at any instant, there are:
* People who want to make semantic web assertions P
* assertions that those people want to make
  A(p) for p in P
* for each assertion, their desired lifetime
 (how long each person wants to make sure the
 assertion is interpretable)
   D(a) for a in A(p) for p in P
* terms needed in those assertions
   T(a) for a in A(p) for p in P
* URIs under the control of those people
 which are appropriate
   U(t) for t in T(a) for a in A(p) for p in P
* expected lifetime of those URIs
   E(u) for u in U(t) for t in T(a) for
  a in A(p) for p in P.


CLAIM:

Most people don't have the ability to make
assertions for which the URIs they use have
an expected lifetime longer than the desired
lifetime of all of the assertions they want
to make.

for a large percentage of p in P
there are some assertions a in A(p)
such that, for some needed term
t in T(a), the desired
lifetime of the assertion D(a) exceeds
the maximum expected lifetime of
all resources available to p.
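
One way to read the claim operationally is sketched below, using the D, T, U and E notation from the list above. The class and function names are hypothetical, lifetimes are assumed to be in years, and "resources available to p" is interpreted as the candidate URIs U(t) for each term; this is an illustration of the claim's structure, not evidence for or against it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Assertion:
    name: str
    desired_lifetime: float  # D(a), in years
    # Maps each needed term t in T(a) to the expected lifetimes E(u)
    # of the URIs u in U(t) that the author could use for it.
    term_uri_lifetimes: Dict[str, List[float]] = field(default_factory=dict)


def at_risk_terms(a: Assertion) -> List[str]:
    """Terms whose longest-lived candidate URI falls short of D(a)."""
    return [
        term
        for term, lifetimes in a.term_uri_lifetimes.items()
        if max(lifetimes, default=0.0) < a.desired_lifetime
    ]


def fraction_of_people_at_risk(people: Dict[str, List[Assertion]]) -> float:
    """Share of people p with at least one assertion in A(p) that has an at-risk term."""
    if not people:
        return 0.0
    at_risk = sum(
        1
        for assertions in people.values()
        if any(at_risk_terms(a) for a in assertions)
    )
    return at_risk / len(people)
```

The claim then amounts to saying that `fraction_of_people_at_risk` would come out large for any realistic population, which depends entirely on lifetime estimates nobody in this thread has measured.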


Larry
--
http://larry.masinter.net
