- From: Jonathan Rees <jar@creativecommons.org>
- Date: Mon, 16 Mar 2009 23:54:00 +0100
- To: Larry Masinter <masinter@adobe.com>
- Cc: "www-archive@w3.org" <www-archive@w3.org>
Everything you say could be true, but I'm not sure what the point is. You get what you pay for.

The problem is, someone in the year 2029 encounters a use of some identifier in some document or database, and wants to know what it refers to. It's a so-called "persistent identifier" to the degree that this is likely to succeed. (Identifiers aren't in themselves persistent or not; it's the possibility of dereferencing them that would be. And application of the label "persistent" is always mere wishful thinking; there's no test for it.) Persistence for some number of years can be arranged via an SLA or endowment, and replication is a big help, but as time goes on these machinations all lose their oomph.

The web is not the only way to figure out what someone meant by a term; any index or table or database that contains the correct information will do. So the issue has little to do with anything outliving the web. It is merely about whether the (meta)data someone needs will exist in a place they can get to, when they need it. The problem existed pre-web and was solved through a replicated infrastructure (library card catalogs and holdings). If a library burned down, you could usually find what you wanted at another library.

(If you think the use of http: syntax for identifiers puts them at a disadvantage relative to urn:, I'm not sure why this should be the case - the syntax shouldn't matter. (At least not for RDF, which as I said should declare independence from the HTTP protocol, while maintaining a sort of opportunistic and nonbinding allegiance.) In any case the choice of URI scheme is a minor problem relative to that of future accessibility.)

I don't know how to assess your claim; it may be true or not. But it seems obvious that someone who wants assertions (whether their own or someone else's, it doesn't matter) to be understood at time t knows that the terms used in the assertions have to be understandable at time t.
If they know what they're doing they'll take pains to make sure that for each term used either (a) the term belongs to a vocabulary that seems quite likely to be alive at time t, or else (b) information designed to promote understandability is included in the context of the assertion (i.e. in the same file) so that it will be carried along with the assertion as it goes through life (akin to propagating the full citation along with a DOI, even though in principle the DOI by itself is sufficient). Such information could be a "definition" or defining properties, location hints (locations of copies), and/or other stuff.

I try to stay away from the "semantic web" movement because it seems not to care about this problem - the implicit assumption is that all assertions are ephemeral. Coming up with credible URIs was the first problem I hit when I started doing RDF, and after three years I'm only now making a little headway on it.

Coincidentally, today I had a couple of conversations about the need for open replicable metadata, as a way to make identifier systems more credible, trusted, and likely to persist. (I'm at the International Repositories Workshop in Amsterdam.)

By "credible commitments" I meant things like the cool-URIs site policy for w3.org. Because of this, and a bet that w3.org will outlive neurocommons.org, I prefer URIs beginning http://w3.org/ to those beginning http://neurocommons.org/ (other things being equal). And I figure that by the time ICANN goes sour or w3.org folds, there will be alternative resolution methods, of the sort that is encouraged by URNs (and maybe handles?) and ought to be encouraged for http: as well.

Jonathan

On Mar 16, 2009, at 2:45 AM, Larry Masinter wrote:

> I'm still stuck on the lifetimes of URIs vs. lifetimes
> of statements, in engineering the semantic web:
>
> "... you might be able to
> make some plausible predictions or credible commitments.."
>
> Stuff goes away.
> Mean time between site failure might be less
> than 10 years. Companies change their names, merge, split,
> go out of business, stop doing the business that caused them
> to bring up the web site. Students graduate. Non-profit
> organizations change brands. Web technology itself is
> only 20 years old, 20 years from now. Sure, maybe some will
> still be around, but on the average, no one has the
> foundation or insurance policy to guarantee that a
> URI will still be around to respond "200-" to anything
> for the expected lifetime of the assertion being made.
>
> Many industries and applications have a requirement that
> the statements made and inferences about them need to last
> much longer than 20 years: government documents, descriptions
> of building plans, life insurance policies.
>
> Anyone who wants to make a "semantic web" statement which
> needs to have meaning beyond the guaranteed lifetime of the
> web sites used to form their "ontology" cannot link the
> meaning of those statements to the future 200-response
> expectation of the referenced web site. The expected
> lifetime of any particular piece of web content is much
> less than the needed lifetime of the validity of semantics
> and understanding of semantic intent.
>
> I think it is more natural to assume that there are
> *no* stable URIs in the long run: every URI has a
> lifetime, we wish every one to have as long a life
> as possible, but every single URI will, at some point
> in the future, evaporate.
> Consider:
>
> at any instant, there are:
> * People who want to make semantic web assertions P
> * assertions that those people want to make
>   A(p) for p in P
> * for each assertion, their desired lifetime
>   (how long each person wants to make sure the
>   assertion is interpretable)
>   D(a) for a in A(p) for p in P
> * terms needed in those assertions
>   T(a) for a in A(p) for p in P
> * URIs under the control of those people
>   which are appropriate
>   U(t) for t in T(a) for a in A(p) for p in P
> * expected lifetime of those URIs
>   E(u) for u in U(t) for t in T(a) for
>   a in A(p) for p in P.
>
>
> CLAIM:
>
> Most people don't have the ability to make
> assertions for which the URIs they use have
> an expected lifetime longer than the desired
> lifetime of all of the assertions they want
> to make.
>
> for a large percentage of p in P
> there are some assertions a in A(p)
> such that, for some needed term
> t in T(a), the desired
> lifetime of the assertion D(a) exceeds
> the maximum expected lifetime of
> all resources available to p.
>
>
> Larry
> --
> http://larry.masinter.net
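
The claim quoted above can be made concrete with a small Python sketch. It is only one possible reading, not anything from the original thread: the class and function names (`Assertion`, `Person`, `at_risk_terms`, `fraction_affected`) are hypothetical, the example URIs and lifetimes are made-up illustrative values, and "resources available to p" is taken to mean the URIs p could use for each term.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Assertion:
    desired_lifetime: float  # D(a), in years
    terms: tuple             # T(a), the terms the assertion needs


@dataclass
class Person:
    assertions: list                          # A(p)
    uris: dict = field(default_factory=dict)  # term -> URIs available to p, i.e. U(t)


def at_risk_terms(assertion, person, expected_lifetime):
    """Terms t in T(a) for which D(a) exceeds the best E(u) over u in U(t)."""
    risky = []
    for term in assertion.terms:
        candidates = person.uris.get(term, ())
        # A term with no usable URI is treated as having lifetime 0 (assumption).
        best = max((expected_lifetime.get(u, 0.0) for u in candidates), default=0.0)
        if best < assertion.desired_lifetime:
            risky.append(term)
    return risky


def fraction_affected(people, expected_lifetime):
    """Fraction of p in P having at least one assertion with an at-risk term."""
    if not people:
        return 0.0
    affected = sum(
        1
        for p in people
        if any(at_risk_terms(a, p, expected_lifetime) for a in p.assertions)
    )
    return affected / len(people)


# Illustrative values only; none of these figures come from the thread.
expected_lifetime = {
    "http://example.org/vocab#Building": 10.0,
    "http://example.com/terms#Policy": 50.0,
}
alice = Person(
    assertions=[Assertion(30.0, ("Building",))],
    uris={"Building": ["http://example.org/vocab#Building"]},
)
bob = Person(
    assertions=[Assertion(30.0, ("Policy",))],
    uris={"Policy": ["http://example.com/terms#Policy"]},
)
print(fraction_affected([alice, bob], expected_lifetime))
```

Under this reading the claim says that `fraction_affected` is large for realistic inputs. Whether that holds depends entirely on the lifetime estimates E(u), which, as noted earlier in the reply, cannot really be tested in advance.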
Received on Monday, 16 March 2009 22:54:43 UTC