Re: Why URIs in RDF? from Sandro Hawke on 2003-01-15 (www-tag@w3.org from January 2003)

From: Sandro Hawke <sandro@w3.org>
Date: Wed, 15 Jan 2003 17:34:59 -0500
To: Jan Algermissen <algermissen@acm.org>
cc: www-tag@w3.org
Message-Id: <200301152235.h0FMYxY21169@wadimousa.hawke.org>
> I am not sure if it is appropriate to post to this list, apologies if it
> isn't. I have only a short question to help me understand these important
> issues.

Posting to the list has the advantage that other people benefit from
the dialog.  For future reference: if you feel like it's really not
going to be of interest to someone reading this list, then please at
least cc: www-archive@w3.org, so I and others can refer to the e-mail.

> Sandro Hawke wrote:
> 
> > My points:
> > 
> >   1.  Producers and consumers of RDF information need to know what
> >       many of the URIs in a document denote, if useful communication
> >       is to occur.  They can learn that nicely by doing a web lookup
> >       on the URIs.
> 
> But isn't the whole point that one cannot rely on getting a particular document
> (representation) when "doing a web lookup on the URI"? 
> 
> So, what if the representation I GET back for a URI at time T1 denotes another
> meaning than the representation I GET back for the same URI at time T2? Doesn't that
> mean I can't rely on the meaning of the URI to be consistent over time?
> 
> Does the RDF-approach of using URIs for KR consider this? Can you explain how,
> or send me a pointer to a description?

There are several RDF approaches, and the most official one (the
current RDF Core Working Drafts) is perhaps too vague to be of much
help.  Graham Klyne (an editor of that document) has argued that's
for the best, and perhaps he's right.   The working group has to reach
something like consensus; I just have to convince you.  :)

The approach I am here advocating is this: URIs are strings which are
useful as identifiers in two different ways in RDF.  In "page mode"
the string identifies a maintainable information repository, a sort of
shared-memory location.  In "subject mode" the string identifies an
arbitrary thing which is the primary subject of the maintainable
information repository.  (Of course, not all strings have a page mode
denotation, and of those, only some have a subject mode denotation.)

When you HTTP GET the URI, the returned MIME entity is a serialization
of the current content of the information repository in some
negotiated language.  That serialization conveys some information
about the primary subject of the repository.  The English word
"representation" is flexible enough that it's reasonable to say the
MIME entity is (1) a representation of the current content of the
information repository *and* (2) a representation of the primary
subject of that repository.  So either way, it's okay to say the
returned entity is a representation of whatever the URI denotes.  
[ (1) is representation in the KR sense, where bit-strings represent
knowledge; (2) is representation in the sense that a photograph of a
person is a representation of that person. ]
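
A rough sketch of that picture, in Python: a repository holds some
current content about one primary subject, and a GET returns a
serialization of that content in the first language the client accepts.
The Repository class, its get method, and the two media types are
assumptions made for illustration. Real HTTP content negotiation
(quality values, wildcards) is deliberately left out.

```python
import json


class Repository:
    """A maintainable information repository about one primary subject."""

    def __init__(self, subject, facts):
        self.subject = subject    # what the repository is about
        self.facts = dict(facts)  # the current, changeable content

    def get(self, accept):
        """Return (media_type, body) for the first acceptable language.

        `accept` is a list of media types in order of preference; this is
        a simplification of the HTTP Accept header.
        """
        for media_type in accept:
            if media_type == "application/json":
                body = json.dumps(
                    {"subject": self.subject, "facts": self.facts},
                    sort_keys=True,
                )
                return media_type, body
            if media_type == "text/plain":
                lines = ["about: " + self.subject]
                lines += [f"{k}: {v}" for k, v in sorted(self.facts.items())]
                return media_type, "\n".join(lines)
        raise LookupError("no acceptable serialization")
```

Both returned bodies serialize the same content and say something about
the same subject. Assigning to repo.facts changes what later GETs
return without changing what the repository is about.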

Now to your question.   I think the correct characterization is that
the page-mode and subject-mode mappings SHOULD NOT change over time.
This is the formal version of Cool URIs Don't Change [1].  People
may write things about the repository ("It's full of good stuff", 
"the privacy policy is evil", "I have a copy of the contents from
2001-02-01T00:04") or about the subject ("I have one of these for
sale", "This is my sister", "This will appear over the horizon at
6:14am (on some day at some location)").  

If either mapping changes too much, then someone may have been made
into a liar.  This isn't very nice, but for various social reasons it
happens on the normal web, too.  I had a nice website which I forgot
to pay the bills on, and it got taken away from me.  I don't know who
might have linked to it, but if anyone had a page with a link saying
"This is Sandro's site", the combination of my negligence and
ICANN/Verisign's policies made their content become false.  Sad but
true.  I think it's the price we pay for having *maintainable*
repositories.

The content of the repository, the information about the subject,
certainly MAY change.  In many cases it should change often.  But
that's an entirely different matter.

I suspect security-conscious semantic web applications will use a
combination of public-key cryptography (to make sure they are hearing
from the repository's authorized servers) and secure hash functions
(for when they simply want to refer to static content).

For instance, do you want to use WebOnt's latest Recommended version
of owl:disjointUnionOf?  Then you look for the ontology signed by
WebOnt.  Do you want to use version 3.4?  Then you can just use the
SHA1 of the 3.4 specs, so even WebOnt can't make a liar of you.  The
technical details are trivial and left as an exercise for the reader.
:-)
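
For the hash half of that, a minimal sketch in Python using the
standard hashlib module. The function names pin and matches_pin and the
sample byte strings are hypothetical, and the bytes are placeholders
rather than any real specification text. SHA-1 appears only because it
is the function named above; the signature half is not shown.

```python
import hashlib


def pin(content: bytes) -> str:
    """Return the SHA-1 digest of `content` as 40 hex characters."""
    return hashlib.sha1(content).hexdigest()


def matches_pin(content: bytes, pinned: str) -> bool:
    """True if `content` is byte-for-byte what was pinned earlier."""
    return pin(content) == pinned


# Placeholder bytes standing in for a static version of a document.
original = b"placeholder text of version 3.4"
pinned = pin(original)

# Later, whatever the repository now serves is checked against the pin.
unchanged = matches_pin(b"placeholder text of version 3.4", pinned)
changed = matches_pin(b"placeholder text of version 3.5", pinned)
```

A reference that carries the pinned digest stays true of the static
content even if the repository behind the URI is later edited, since
any edit to the bytes yields a different digest.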

     -- sandro

[1] http://www.w3.org/Provider/Style/URI
Received on Wednesday, 15 January 2003 17:36:02 UTC