Having it Both Ways (and Ending URI Confusion)

The RDF community has had a long-standing confusion about how to use
URIs.  This has spilled over into TAG issue 14 [1].   The debate has
gotten quite painful, and neither the RDF Core WG nor the TAG has been
able to settle it in a way which makes people very happy.

The current compromise (which I have vocally advocated, in a failed
attempt to attain happy consensus) uses URI-References for identifying
things which are not web pages and uses plain HTTP URIs for things
which are.  This has a serious flaw when we consider talking about
HTML or XML fragments: it seems obvious that one could use the
URI-Reference syntax (it's just a kind of URI according to the TAG),
but the meaning is no longer that defined by the HTML and XPointer
specifications; somehow in RDF such URIs now mean something different.
As long as one never combines RDF with other XML using an ID
attribute, the confusion never causes a total train wreck, but the
tracks still cross.

I used to think this was the best we could do, but now I see something
much simpler: we can have it both ways.   Here's an example of the two
ways, in conflict:

1.   A URI names a web page:

     <rdf:Description rdf:about="http://www.w3.org/">
        <dc:title>The World Wide Web Consortium</dc:title>
     </rdf:Description>

     The title is the title of a web page, a document.

2.   A URI names an arbitrary thing, by way of a website perhaps
     containing information about it:

     <rdf:Description rdf:about="http://www.w3.org/">
        <org:missionStatementBrief>To Bring the Web to its Full Potential</org:missionStatementBrief>
     </rdf:Description>

     The brief mission statement is a property of the W3C itself (even
     though it does happen to appear on the page, along with lots of
     other information about the consortium).

We can reconcile these two uses and make them coexist in peace by
seeing that we simply have two different relationships between
URI-syntax strings and things they might name.  So we should
use two different RDF properties.   We should say:

3.  <rdf:Description ns:URI="http://www.w3.org/">
       <dc:title>The World Wide Web Consortium</dc:title>
    </rdf:Description>

    There is a thing with the URI http://www.w3.org/ and the dc:title 
    "The World Wide Web Consortium".

4.  <rdf:Description ns:homePageURI="http://www.w3.org/">
       <org:missionStatementBrief>To Bring the Web to its Full Potential
       </org:missionStatementBrief>
    </rdf:Description>

    There is a thing which has a home page with the URI
    http://www.w3.org/ and the org:missionStatementBrief "To Bring the
    Web to its Full Potential".

The first is a description of a web page and the second is a
description of the W3C.   They are each unambiguously identified, but
different inverse-functional properties are used.   The web page is
simply identified by its URI.   The consortium is identified by the
URI of its home page.   There is no confusion as long as we say
explicitly which property we are using.
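
To illustrate how naming the property keeps the page and the consortium
apart, here is a minimal Python sketch.  It assumes each description is
a flat dictionary from property to value, treats the prefixed names as
plain strings (ns: and org: are never bound to namespaces in this
message), and uses a hypothetical ex:shortName property purely for
illustration.  It is not how any RDF toolkit works; it only shows the
idea of keying things by (property, value) rather than by the bare URI
string.

```python
W3C = "http://www.w3.org/"

# The two identifying properties proposed above, as plain strings.
IDENTIFYING = {"ns:URI", "ns:homePageURI"}


def group_by_identity(descriptions):
    """Merge descriptions that share an (identifying property, value) pair."""
    things = {}
    for desc in descriptions:
        # Everything that is not an identifying property is an ordinary fact.
        facts = {p: v for p, v in desc.items() if p not in IDENTIFYING}
        for p, v in desc.items():
            if p in IDENTIFYING:
                things.setdefault((p, v), {}).update(facts)
    return things


descriptions = [
    # Example 3: the web page itself.
    {"ns:URI": W3C, "dc:title": "The World Wide Web Consortium"},
    # Example 4: the consortium, identified by its home page.
    {"ns:homePageURI": W3C,
     "org:missionStatementBrief": "To Bring the Web to its Full Potential"},
    # A second description of the consortium; ex:shortName is hypothetical.
    {"ns:homePageURI": W3C, "ex:shortName": "W3C"},
]

things = group_by_identity(descriptions)
for key, facts in things.items():
    print(key, facts)
```

Both keys carry the same URI string, but because the property is part
of the key, the title stays with the page and the mission statement
stays with the consortium.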

Of course rdf:about is considered a bit of syntax, not a property.
The first two examples each generate one triple, while the second two
each generate two triples.  I think that's the price we pay for adding
the extra bit saying which technique we're using for identification and
grounding via the web.
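
As a rough sketch of that triple count, the examples can be written out
as (subject, predicate, object) tuples in Python.  The blank node
labels "_:page" and "_:org" are names chosen here, and the prefixed
names are left as plain strings since their namespaces are not given in
this message.

```python
W3C = "http://www.w3.org/"

# Example 1: rdf:about puts the URI directly in the subject position.
example_1 = [
    (W3C, "dc:title", "The World Wide Web Consortium"),
]

# Example 3: the subject is a blank node, and an extra triple states
# how the URI string relates to it.
example_3 = [
    ("_:page", "ns:URI", W3C),
    ("_:page", "dc:title", "The World Wide Web Consortium"),
]

# Example 4: same shape, but the relationship is "has this home page".
example_4 = [
    ("_:org", "ns:homePageURI", W3C),
    ("_:org", "org:missionStatementBrief",
     "To Bring the Web to its Full Potential"),
]

for name, triples in [("example 1", example_1),
                      ("example 3", example_3),
                      ("example 4", example_4)]:
    print(name, len(triples), "triple(s)")
```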

If the RDF Core WG could pick one of the properties (ns:URI or
ns:homePageURI) as the "true" meaning of rdf:about, then either 3 or 4
would drop back down to one triple.  But I'm afraid either option
leaves a lot of existing RDF out in the cold.  So perhaps we're best
off recognizing that rdf:about is ambiguous and best avoided in favor
of these two new predicates.

   -- sandro

[1] http://www.w3.org/2001/tag/ilist#httpRange-14

Received on Friday, 13 December 2002 09:22:55 UTC