RE: does RDF require understanding all 82 URI schemes? from Bill de hOra on 2001-02-15 (www-rdf-interest@w3.org from February 2001)

From: Bill de hOra <bill@dehora.fsnet.co.uk>
Date: Thu, 15 Feb 2001 00:20:14 -0000
To: "Aaron Swartz" <aswartz@swartzfam.com>, "Graham Klyne" <GK@ninebynine.org>, "Pierre-Antoine CHAMPIN" <champin@bat710.univ-lyon1.fr>
Cc: "RDF interest group" <www-rdf-interest@w3.org>
Message-ID: <DCEBKOHMHCKKIAAPKLLMOEIKCBAA.bill@dehora.fsnet.co.uk>

: -----Original Message-----
: From: www-rdf-interest-request@w3.org
: [mailto:www-rdf-interest-request@w3.org]On Behalf Of Aaron Swartz
: Sent: Wednesday, February 14, 2001 11:25 PM
: To: Bill de hOra; Graham Klyne; Pierre-Antoine CHAMPIN
: Cc: RDF interest group
: Subject: Re: does RDF require understanding all 82 URI schemes?
:
:
: Bill de hOra <bill@dehora.fsnet.co.uk> wrote:
:
: > "Paris" -> data:,Paris
: > "Paris" -> data:,Paris
: > "Paris" -> data:,Paris
: >
: > The first "Paris" was the name of a perfume, the second was the name of
: > a god, the third was the name of a city. By converting them all to URIs,
: > I've lost information.
:
: Aaron Swartz:
: No you haven't. You're assuming that the data: series of URIs actually
: represents the resource in whose context they are used. This is not true --
: a data: URI simply represents the string of characters in its content, not
: the meaning of the characters.

But there is an algorithm for matching URIs precisely because URIs are
considered to be unique entities; literals are not unique. Above the syntactic
level, one "Green" is not necessarily the same as another "Green". It's not safe
(semantically or pragmatically) to convert string literals into URIs, because
it's not safe to provide machines with algorithms for matching them.

Two identical literals could stand for very different things, even when the
creator of the data hasn't made that intent sufficiently obvious to a machine.
For similar reasons, XML now provides namespaces to differentiate between
elements and attributes that are syntactically identical but aren't semantically
intended to be identical.
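
A minimal sketch of the namespace point, assuming Python's standard
xml.etree.ElementTree. The document and both example.org namespace URIs are
made up for illustration. Two elements with the same local name and the same
text still compare as different once their namespaces are taken into account:

```python
import xml.etree.ElementTree as ET

# Both namespace URIs below are made-up placeholders for illustration.
doc = """
<catalogue xmlns:perfume="http://example.org/perfume"
           xmlns:city="http://example.org/city">
  <perfume:name>Paris</perfume:name>
  <city:name>Paris</city:name>
</catalogue>
"""

root = ET.fromstring(doc.strip())
first, second = list(root)

# ElementTree expands each tag to "{namespace-uri}local-name", so the two
# "name" elements compare as different even though their text is identical.
same_text = first.text == second.text
same_tag = first.tag == second.tag
print(same_text, same_tag)  # prints: True False
```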

: Aaron Swartz:
: When I make the statement that:
:
:     <http://aaronsw.com/> dc:title <data:,Aaron%20Swartz>
:
: I'm simply saying that the title is the set of characters: "Aaron Swartz".
: No more, no less. If I want, I can look at the other things hanging off that
: URI and see the other things that this string of characters represents.
: I think this is more information, not less.

But you've also provided an option for matching that URI to another identical
one. Here you've provided a data URI in advance. Mainly, I'm talking about
converting a literal to a URI using the data:, form, which is lossy. Perhaps
mapping:

<http://aaronsw.com/> dc:title <data:,Aaron%20Swartz>
<http://aaronsw.com/> dc:title "Aaron Swartz"
<http://aaronsw.com/sw/> dc:title "Aaron Swartz"

to:

<http://aaronsw.com/> dc:title <data:,Aaron%20Swartz>
<http://aaronsw.com/> dc:title <data:,Aaron%20Swartz>
<http://aaronsw.com/sw/> dc:title <data:,Aaron%20Swartz>

is a reasonable inference, but could also be a Terrible Mistake. Also, consider
circumstances where a machine is collecting RDF from different places:

<http://aaronsw.com/> dc:title <data:,Aaron%20Swartz>

You made this statement available; some machine picked it up and stuffed it
into its KB. It later picks this up from me:

<http://dehora.com/psycho/aaronswartzshrine> dc:subject <data:,Aaron%20Swartz>

And in turn it puts that into its KB too. Now those data:, URIs can be matched.
But actually I'm referring to a different Aaron Swartz altogether, and the two
have been conflated by virtue of having identical URIs. That's unfortunate, and
it is also lossy.
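
A rough sketch of that conflation in Python, under stated assumptions: the KB
is just a list of (subject, predicate, object) tuples, each literal occurrence
is modelled as its own node, and the URI, Literal, to_data_uri and
shared_objects names are all hypothetical helpers, not part of any RDF toolkit.

```python
from dataclasses import dataclass, field
from itertools import count
from urllib.parse import quote

_occurrences = count()

@dataclass(frozen=True)
class URI:
    value: str  # assumption: two URI nodes match iff their strings are equal

@dataclass(frozen=True)
class Literal:
    text: str
    # Assumption: every literal occurrence gets its own id, so two literals
    # with identical text are never unified automatically.
    occurrence: int = field(default_factory=lambda: next(_occurrences))

def to_data_uri(lit):
    """Hypothetical conversion of a literal to the data:, form."""
    return URI("data:," + quote(lit.text, safe=""))

def shared_objects(triples):
    """Return the object nodes that appear in more than one statement."""
    seen, shared = set(), set()
    for _, _, obj in triples:
        (shared if obj in seen else seen).add(obj)
    return shared

kb = [
    (URI("http://aaronsw.com/"), "dc:title", Literal("Aaron Swartz")),
    (URI("http://dehora.com/psycho/aaronswartzshrine"), "dc:subject",
     Literal("Aaron Swartz")),
]

# Rewrite every literal object into a data:, URI, as in the mapping above.
converted = [(s, p, to_data_uri(o) if isinstance(o, Literal) else o)
             for s, p, o in kb]

print(shared_objects(kb))         # prints: set()
print(shared_objects(converted))  # prints: {URI(value='data:,Aaron%20Swartz')}
```

With the literals left alone, the two statements share nothing. After
conversion they hang off one node, and nothing in the KB records that they
came from different sources with different intent.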

I've just seen a post by Seth Russell asking why the data:, form is preferable.
My answer is this: the literal form is preferable in some circumstances
precisely because it can't be normatively unified with another identical
literal.

Bill de hOra

Received on Wednesday, 14 February 2001 19:20:33 UTC