Re: Person Identifier from Bent Rasmussen on 2008-04-22 (semantic-web@w3.org from April 2008)

From: Bent Rasmussen <incredibleshrinkingsphere@gmail.com>
Date: Tue, 22 Apr 2008 03:59:44 +0200
To: "Booth, David (HP Software - Boston)" <dbooth@hp.com>, <al@jku.at>, "Mark Birbeck" <mark.birbeck@x-port.net>
Cc: "Richard Cyganiak" <richard@cyganiak.de>, <semantic-web@w3.org>
Message-ID: <5C199ED37C1943C384E3A407F40171E7@BentPC>
Let's look at some examples

.

Example 1

<http://smith.mr/foaf.rdf> rdf:type foaf:Person

- there is a thing which is said to be of type foaf:Person
- the thing is identified by http://smith.mr/foaf.rdf

.

Example 2

<http://smith.mr/foaf.rdf> rdf:type ont:WebPage

- there is a thing which is said to be of type ont:WebPage
- the thing is, and is identified by, http://smith.mr/foaf.rdf

.

The dangerous assumption is not that the identifier cannot be useful, but that the representation (necessarily) says anything (correct and/or current) about the thing being identified by the identifier. (- I believe.)

.

Example 3

<urn:obscure:f6sjg38gks629fk37sdkd> rdf:type ont:WebPage
<urn:obscure:f6sjg38gks629fk37sdkd> ont:address "http://smith.mr/foaf.rdf"

- there is a thing which is said to be of type ont:WebPage
- the thing is identified by urn:obscure:f6sjg38gks629fk37sdkd
- the thing has the address http://smith.mr/foaf.rdf

.

If one wants to keep the URI more readable, one could use a different convention

urn:uri:http://smith.mr/foaf.rdf

.

Example 4

<urn:uri:http://smith.mr/foaf.rdf> rdf:type foaf:Person
<urn:uri:http://smith.mr/foaf.rdf> ont:address "http://smith.mr/foaf.rdf"

This is so that we do not mistakenly assume that a subject identifier is *ever* about a representation retrieved via that identifier. This is the danger, I believe.
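
A minimal sketch of the convention in Example 4, assuming triples are held as plain (subject, predicate, object) tuples. The helper names subject_id and describe are made up for illustration and do not come from any RDF library.

```python
# Sketch of the "urn:uri:" convention from Example 4.
# subject_id and describe are hypothetical helper names.

def subject_id(address):
    """Wrap a web address in the urn:uri: prefix so it names a thing, not a page."""
    return "urn:uri:" + address

def describe(address, rdf_type):
    """Return the two statements of Example 4 as (subject, predicate, object) tuples."""
    s = subject_id(address)
    return [
        (s, "rdf:type", rdf_type),
        (s, "ont:address", address),  # the address stays a plain string literal
    ]

triples = describe("http://smith.mr/foaf.rdf", "foaf:Person")
for triple in triples:
    print(triple)
```

The point of the wrapper is only that the subject string and the address string can never be equal, so a statement about the subject cannot be read as a statement about the document at the address.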

.

I believe this should be avoided like the plague

<http://smith.mr/foaf.rdf> rdf:type ont:WebPage
<http://smith.mr/foaf.rdf> ont:address <http://smith.mr/foaf.rdf>

It leads to ambiguity because the same identifier is likely to be used for a different purpose.

But I'd rather have this

<urn:uri:http://smith.mr/foaf.rdf> rdf:type foaf:Person
<urn:uri:http://smith.mr/foaf.rdf> rdfs:seeAlso <http://smith.mr/foaf.rdf>

Then if the representation for <http://some.thing> has statements about <urn:uri:http://some.thing>, it is plausible that they are consistent.

The problem is the accident of

Source1:

<urn:uri:http://smith.mr/foaf.rdf> rdf:type foaf:Person
<urn:uri:http://smith.mr/foaf.rdf> rdfs:seeAlso <http://smith.mr/foaf.rdf>

Source2:

<urn:uri:http://smith.mr/foaf.rdf> rdf:type ont:WebPage
<urn:uri:http://smith.mr/foaf.rdf> rdfs:seeAlso <http://smith.mr/foaf.rdf>

Back to square one.
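
The accident can be sketched as a merge of the two sources followed by a check for subjects that end up with more than one type. This assumes foaf:Person and ont:WebPage are meant to be disjoint (nothing in RDF itself forbids a resource from having two types), keeps only the rdf:type statements, and uses the hypothetical helper names types_by_subject and ambiguous_subjects.

```python
from collections import defaultdict

SUBJECT = "urn:uri:http://smith.mr/foaf.rdf"

# Only the rdf:type statements of each source matter for this check.
SOURCE1 = [(SUBJECT, "rdf:type", "foaf:Person")]
SOURCE2 = [(SUBJECT, "rdf:type", "ont:WebPage")]

def types_by_subject(*sources):
    """Collect every rdf:type object asserted for each subject across all sources."""
    types = defaultdict(set)
    for triples in sources:
        for s, p, o in triples:
            if p == "rdf:type":
                types[s].add(o)
    return types

def ambiguous_subjects(*sources):
    """Return subjects that were given more than one type once the sources are merged."""
    merged = types_by_subject(*sources)
    return {s: sorted(ts) for s, ts in merged.items() if len(ts) > 1}

print(ambiguous_subjects(SOURCE1, SOURCE2))
```

Each source is fine on its own; the clash only shows up after the merge, which is why the prefix convention alone does not help.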

To solve this problem, one might either use completely arbitrary identifiers (GUIDs, etc.) or use a convention that ensures some level of disambiguation, or rather prevents some level of repeated use of an identifier for different things:

<urn:rdf:?type-ns=foaf&type=Person&ref=http://smith.mr/foaf.rdf>
<urn:rdf:?type-ns=ont&type=WebPage&ref=http://smith.mr/foaf.rdf>

But instead of all this mumbo jumbo, it would be much easier to just have scrambled URIs as identifiers, I think.
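
Both options can be sketched in a few lines. The first helper reproduces the ad hoc type-qualified form shown above (which is probably not valid URN syntax), and the second produces a scrambled identifier using the standard urn:uuid: form from Python's uuid module. The names typed_id and scrambled_id are hypothetical.

```python
import uuid

def typed_id(type_ns, type_name, ref):
    """Build the type-qualified identifier in the ad hoc syntax used in this message."""
    return f"urn:rdf:?type-ns={type_ns}&type={type_name}&ref={ref}"

def scrambled_id():
    """Return a random identifier that carries no hint about any address or type."""
    return uuid.uuid4().urn

person_id = typed_id("foaf", "Person", "http://smith.mr/foaf.rdf")
page_id = typed_id("ont", "WebPage", "http://smith.mr/foaf.rdf")

print(person_id)
print(page_id)
print(scrambled_id())
```

With the typed form, the person and the page get different identifiers even though they share the same ref. With the scrambled form, two parties cannot collide on an identifier by accident, but they also cannot guess each other's identifier for the same thing.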

.

When talking about a concrete representation, a timestamp, temporal interval or hash code should be used, I gather, unless the statement is of a more general nature, about some characteristic that is assumed to be invariant.
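
As a rough sketch of what pinning down a concrete representation could look like, the function below pairs a SHA-256 hash of the retrieved bytes with a UTC retrieval timestamp. The choice of SHA-256, the timestamp format and the name fingerprint are all assumptions made for illustration.

```python
import hashlib
from datetime import datetime, timezone

def fingerprint(body, retrieved_at):
    """Return (hex digest of the bytes, UTC timestamp string) for one retrieval."""
    digest = hashlib.sha256(body).hexdigest()
    stamp = retrieved_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return digest, stamp

when = datetime(2008, 4, 22, 2, 0, 27, tzinfo=timezone.utc)

# Two made-up representations served from the same address at the same moment.
old = fingerprint(b"<rdf:RDF>version 1</rdf:RDF>", when)
new = fingerprint(b"<rdf:RDF>version 2</rdf:RDF>", when)

print(old[1])
print(old[0] != new[0])  # different bytes give different digests
```

The hash distinguishes representations by content, while the timestamp records when a given one was observed, so either can anchor a statement about "the page as it was".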

.

Then there is the question of trust, precedence and relevance. That's too deep for me, I just hope some smart people can sort out that mess. :-)

.

Bent

PS I'm sure the URI/URN syntax is violated here; possibly some logic as well, it's getting late here...




From: Booth, David (HP Software - Boston) 
Sent: Monday, April 21, 2008 11:31 PM
To: al@jku.at ; Mark Birbeck 
Cc: Richard Cyganiak ; Bent Rasmussen ; semantic-web@w3.org 
Subject: RE: Person Identifier



> From: Andreas Langegger
> [ . . . ]
> Another suggestion would be to inject a protocol hint into the URI by
> the convention of a special sub-domain like
> http://doi.yourdomain.tld/somePath/resource/foo

> From: Andreas Langegger
> [ . . . ]
> What if we find a way to "optionally" make one to find out more about
> the resource and include the type in the URI?

That could be done.  The technique of defining a URI prefix is described here:
http://dbooth.org/2006/urn2http/

> [ . . . ]
> First I also disliked the idea that HTTP URIs should represent non-
> informational resources. The fact that a URI is usually representing
> a web page is so deeply anchored in our thinking that it just sounds
> too obscure.

Yes, that is a disadvantage of HTTP URIs versus URNs . . . and nearly the *only* disadvantage.  This paper
http://dbooth.org/2006/urn2http/
provides an informal proof-by-construction that HTTP URIs can have greater capability than URNs in nearly all cases.

> I think the remaining argument against URIs everywhere + HTTP only is
> that you may have to do thousands of GET requests for a large KB,

But you are never *required* to do a GET.  The follow-your-nose convention gives you the *possibility* of finding useful information when you do a GET, but: (a) there is no guarantee that it will be successful; and (b) you are not required to try.  But even if you do want to GET useful information, the app can be smart about it, using caching and other techniques to avoid unnecessary GETs.
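
One way an app could be smart about GETs is a small cache in front of whatever does the fetching. This is only a sketch: the class CachingDereferencer, its fetch parameter and the fake_web dictionary are hypothetical, and a real fetcher (for example one built on urllib.request.urlopen) would be passed in instead of the stand-in used here.

```python
class CachingDereferencer:
    """Dereference URIs at most once per document, remembering failures too."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable taking a URI and returning bytes
        self._cache = {}
        self.gets = 0        # number of fetches actually performed

    def lookup(self, uri):
        doc = uri.split("#", 1)[0]  # the fragment is never sent in a GET
        if doc not in self._cache:
            self.gets += 1
            try:
                self._cache[doc] = self._fetch(doc)
            except OSError:
                self._cache[doc] = None  # no guarantee the GET succeeds
        return self._cache[doc]

# Stand-in for the network so the sketch runs offline.
fake_web = {"http://smith.mr/foaf.rdf": b"<rdf:RDF/>"}

def fake_fetch(uri):
    if uri not in fake_web:
        raise OSError("not found: " + uri)
    return fake_web[uri]

deref = CachingDereferencer(fake_fetch)
deref.lookup("http://smith.mr/foaf.rdf#me")
deref.lookup("http://smith.mr/foaf.rdf#friend")
deref.lookup("http://smith.mr/foaf.rdf")
print(deref.gets)
```

A thousand URIs that differ only in their fragment cost a single GET here, and a failed lookup is remembered rather than retried, which matches points (a) and (b): success is not guaranteed and trying is optional.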



David Booth, Ph.D.
HP Software
+1 617 629 8881 office  |  dbooth@hp.com
http://www.hp.com/go/software

Opinions expressed herein are those of the author and do not represent the official views of HP unless explicitly stated otherwise.
Received on Tuesday, 22 April 2008 02:00:27 UTC