Extreme RDF, URIs and topic maps from Paul Prescod on 2002-08-09 (www-tag@w3.org from August 2002)

From: Paul Prescod <paul@prescod.net>
Date: Fri, 09 Aug 2002 03:53:48 -0400
To: TAG <www-tag@w3.org>, "Steven R. Newcomb" <srn@coolheads.com>, cmsmcq@w3.org, Dan Connolly <connolly@w3.org>, liam@w3.org
Message-ID: <3D53750C.66D2695B@prescod.net>

I won't have time to catch up on TAG until after the Extreme Markup
conference. But there is an issue Steven Newcomb has raised at the
conference which I believe (believe!) relates to the current
discussions.

His claim is that Topic Maps have a requirement (which seems reasonable
to me): when two different people make assertions about the same
information resource, it must be possible to reliably recognize, in a
standard way, that those assertions are about the same information
resource, no matter what addressing syntax was used.

Please note the phrase "information resource". It is necessary and
important to be able to make assertions about cars and dogs and humans
but it is *impossible* for computers to reliably "know" the identity of
real-world objects. Otherwise Lois Lane could type in the question
"Is Clark Kent also known as Superman?" somewhere and get a correct
answer. I hope this never happens!

But on the other hand, information resources are represented as bits on
disks or in memory and by virtue of that, they have an objective
identity accessible to computers.

Consider an example:

http://www.foo/somedocument.xml#bar IS SECRET
http://www.foo/somedocument.xml#xpointer(//*[@id='bar']) IS IMPORTANT

Now I want to use an RDF query engine to ask the question: is the
element #bar both SECRET and IMPORTANT? 

I do not believe that RDF requires implementations to load
somedocument.xml and determine whether the referents are the same or
not. I think that some implementations might and some might not, and
therefore they might give different answers to the question.
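
To make the difference concrete, here is a minimal sketch in Python of
the two behaviours. It is not how any particular RDF engine works. The
document content, the function names and the tiny fragment resolver are
all assumptions made up for illustration; the resolver only understands
bare names and the one xpointer form used in the example above.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content standing in for http://www.foo/somedocument.xml
DOCUMENT = "<doc><section id='bar'>text</section><section id='baz'/></doc>"

# The two assertions from the example, as (subject, property) pairs.
ASSERTIONS = [
    ("http://www.foo/somedocument.xml#bar", "SECRET"),
    ("http://www.foo/somedocument.xml#xpointer(//*[@id='bar'])", "IMPORTANT"),
]

REQUIRED = {"SECRET", "IMPORTANT"}


def properties_by_string(assertions):
    """Group properties by the literal URI reference (no document loaded)."""
    props = {}
    for subject, prop in assertions:
        props.setdefault(subject, set()).add(prop)
    return props


def resolve(root, uri_ref):
    """Return the document-order index of the addressed element, or None.

    Illustrative only: handles '#name' and "#xpointer(//*[@id='name'])".
    """
    fragment = uri_ref.split("#", 1)[1]
    match = re.fullmatch(r"xpointer\(//\*\[@id='([^']+)'\]\)", fragment)
    wanted_id = match.group(1) if match else fragment
    for index, element in enumerate(root.iter()):
        if element.get("id") == wanted_id:
            return index
    return None


def properties_by_referent(root, assertions):
    """Group properties by the element each URI reference resolves to."""
    props = {}
    for subject, prop in assertions:
        props.setdefault(resolve(root, subject), set()).add(prop)
    return props


def any_subject_has_all(props):
    """True if some subject carries every property in REQUIRED."""
    return any(REQUIRED <= found for found in props.values())


root = ET.fromstring(DOCUMENT)
by_string = any_subject_has_all(properties_by_string(ASSERTIONS))
by_referent = any_subject_has_all(properties_by_referent(root, ASSERTIONS))
```

Under these assumptions the string-keyed engine finds no subject that is
both SECRET and IMPORTANT, while the engine that loads the document and
compares referents finds one. Same assertions, different answers.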

On the other hand, if you claim that even two URI References that
reference the same element could nonetheless refer to different
abstract objects then you will not accept that the two assertions apply
to the same sub-resource. But if this is true then RDF's model is quite
incompatible with that of Topic Maps and is probably not suitable as a
basis for Topic Maps.

Steve also argues that this reliability should be provided for all URI
references containing fragment identifiers, not just those which refer
to an obvious concept of node, like references into XML. Therefore the
"range" of fragment identifiers should be formalized in such a way that
we can say that "the things addressed by them (call them sub-resources,
nodes, whatever) have identity." The grove model is one way to do this:

 * http://www.prescod.net/groves/shorttut/

In the Web world we would probably say that all media types have
infosets behind them, that infoset items have identity, and that
fragment identifiers identify infoset items.

Finally, Steve argues that there should be a standard way to ask a web
server (or "the Web infrastructure") whether two HTTP URIs are synonyms
for the same thing. I disagree with him on this one. Each "thing" should have
one and only one name. Ignoring performance, the only reason to give the
same bits multiple names is to pull the Superman/Clark Kent trick.
Unfortunately it is quite common for Web software to give multiple names
to the "same thing" (/foo, /foo/, /foo/index.html). I think that this
should be discouraged.
-- 
XML, Web Services Architecture, REST Architectural Style
Consulting, training, programming: http://www.constantrevolution.com
Come discuss XML and REST web services at the Extreme Markup Conference

Received on Friday, 9 August 2002 03:56:23 UTC