Re: Well Behaved RDF - Taming Blank Nodes, etc. from Lee Feigenbaum on 2012-12-19 (semantic-web@w3.org from December 2012)

From: Lee Feigenbaum <lee@thefigtrees.net>
Date: Wed, 19 Dec 2012 13:47:35 -0500
To: Henry Story <henry.story@bblfish.net>
CC: Steve Harris <steve.harris@garlik.com>, Pat Hayes <phayes@ihmc.us>, David Booth <david@dbooth.org>, semantic-web <semantic-web@w3.org>
Message-ID: <50D20BC7.3080607@thefigtrees.net>
Henry, could I accurately summarize your reasons for using blank nodes as:

1. I can't trust the data other people publish.
2. Blank nodes mean that no one else can say (false) things about my data.
3. (Also, HTTP URIs mean that I can easily resolve disputes over what 
something should mean.)

Is that fair?

If it is fair, let's go back for a second to the first time I brought up 
UUIDs:

You said:

"""

The web creates such resources all the time: whenever you POST a form
and the server does not give you a URI for the returned result, you have in
fact created a resource that has not got a name. This resource will need to be
described using a blank node. It could be described well enough as a creation of
that form, and with date and time, but giving it a name on that server would be
an error, and forcing oneself to name a remote resource, when the remote owner
did not want to do that is more work than you may want.
"""

I said:

"""

What's wrong with generating a UUID-based URI for a POST request?
"""

And you said:

"""

There's nothing wrong per se. But it is more prone to a mistake happening
than having a blank node. The blank node is the easiest way to deal with
things that don't have names.
"""

It's somewhat clear to me from the rest of this conversation that you've 
been assuming <urn:uuid:...> type URIs, since (if my above summary is 
accurate), your objections come down to not being able to resolve a URI 
for trust/proof purposes. While I use <urn:uuid:...> URIs at times, I 
also often use http://-based URIs that incorporate UUIDs into them. Do 
you have objections to this practice as well? That is, I could identify 
each POST request that comes in with a URI of the form 
http://my.domain.com/{UUID}. (At which point I could of course make 
them resolvable if potential trust/proof issues were important to my 
data/application.)
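
A minimal sketch of the two minting styles mentioned above, assuming Python's standard uuid module and random (version 4) UUIDs. The function names are hypothetical, and http://my.domain.com/ is only the placeholder domain from the paragraph above:

```python
import uuid

# Placeholder base taken from the message; substitute a domain you control.
BASE = "http://my.domain.com/"


def mint_http_uri(base: str = BASE) -> str:
    """Return a fresh HTTP URI that embeds a random (version 4) UUID.

    Hypothetical helper: one such URI could be assigned to each incoming
    POST request. Making it resolvable is a separate, optional step.
    """
    return base + str(uuid.uuid4())


def mint_urn_uri() -> str:
    """Return a fresh <urn:uuid:...> URI, which cannot be dereferenced."""
    return uuid.uuid4().urn


if __name__ == "__main__":
    print(mint_http_uri())
    print(mint_urn_uri())
```

Both forms are stable names. Only the http:// form leaves open the option of serving a description at that address later.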

Lee


On 12/19/2012 1:27 PM, Henry Story wrote:
> On 19 Dec 2012, at 19:10, Steve Harris <steve.harris@garlik.com> wrote:
>
>> On 2012-12-19, at 17:50, Henry Story wrote:
>>> On 19 Dec 2012, at 18:43, Steve Harris <steve.harris@garlik.com> wrote:
>>>
>>>> On 2012-12-19, at 16:36, Lee Feigenbaum wrote:
>>>>>> Henry Story Wrote:
>>>>>> In any case otherwise you end up with names that are just complicated blank nodes, and you
>>>>>> then have exactly the same problem as blank nodes, except you just end up growing and
>>>>>> growing your names as you go along.
>>>>> Well, except they don't have the same problems as blank nodes: UUID URIs are stable from one query to the next and can be linked to and referenced across document/database-context.
>>>> Yes, this is the key problem with bNodes, which means you have to be /really/ careful about how and when you use them.
>>> No, it's the opposite. This is a key problem with UUIDs as I argued in my later mail
>>> http://lists.w3.org/Archives/Public/semantic-web/2012Dec/0097.html
>> Yes, but I don't buy your arguments.
>>
>> You can't "prove" that you "created" some http: URI either, unless the document is signed by an unrevoked key, and that works just as well for any kind of URI.
> The point is that there is a way one can come to agree what the definition of a term means for http
> URIs. You GET it.
>
> There really is no way to do so for a UUID. If two people dispute the meaning of the term, there is no
> way you can come to decide on who was right to use it that way, since either could have come
> to mint it. But next, even if you really worked hard on it, how would you know what the meaning
> of the term was?
>
> And all of that needs to be put into context of what a machine can do reasonably easily. Whatever
> the proof procedure for finding the meaning of a UUID is, it's not something that is going to be doable
> automatically. It would require expert police officers, inquisitions, highly specialised teams to work
> out what is what in there, with access to hardware etc...
>
> Don't forget that I am responding to the following:
> "UUID URIs are stable from one query to the next and can be linked to and referenced across document/database-context."
>
> The name is stable yes, and there are advantages to that, but the meaning is not going to
> be understood, since you have no clear way of telling two divergent meanings apart. So they
> are not really as linkable as you think.
>
>> Also, you say "If you use a UUID you could accidentally make a UUID that someone else has already used." well, it's either not a UUID (e.g. a bogus implementation) or there's some statistically insignificant chance http://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates
>> neither of those cases is very relevant.
> Well in one case you have no chance of making a mistake (bnodes), in the other you have what you think is
> a statistically small chance, but you are not taking into account bad faith. Those are not at all the same
> thing. It's the difference between a mathematical truth that is necessarily true, and one that is contingent.
>
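
To put a rough number on the "statistically insignificant chance" for honestly generated random UUIDs, here is a sketch using the standard birthday-bound approximation. It assumes version 4 UUIDs with 122 random bits; the function name is hypothetical. It says nothing about the bad-faith case raised above, where someone deliberately reuses an existing UUID:

```python
import math

# Version 4 UUIDs: 128 bits, of which 6 are fixed version/variant bits.
RANDOM_BITS = 122


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Approximate chance that n random identifiers contain a duplicate.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / (2N)) with
    N = 2**bits. expm1 keeps precision when the result is tiny.
    """
    space = 2 ** bits
    return -math.expm1(-n * (n - 1) / (2 * space))


if __name__ == "__main__":
    for n in (10**6, 10**9, 10**12):
        print(f"{n:>15,d} UUIDs -> p ~ {collision_probability(n):.3e}")
```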
>
>>> Not every thing that looks like a URI really works like one. For example file:///... URIs
>>> usually are not global identifiers, and even though software accepts it, it's just a hack
>>> people use to get around software that forces them into this kind of situation.
>> There are valid uses for file: URIs, but yes, you have to be careful.
>>
>>> UUIDs are not a good way to go. They make it look like there is agreement, when in fact
>>> conceptually things are just as broken.
>> How does that relate to bNodes? Software doesn't [typically :)] have opinions about the appearance of identifiers.
> The point is that people use things that look like global identifiers, because some software
> shoe-horns them into providing things that look like they are identifiers, even though
> they don't really function in the right way.  So if you forced people to use URIs instead
> of bnodes, they'd end up using URIs that were not global identifiers, but that just looked
> like them.
>
>> - Steve
>>
>>>> We have local hacks to get round the issue, but that's not great. RDF 1.1 bNode skolemisation provides a more generic solution.
>>> Something I'd have to look into.
>>>
>>>> - Steve
>>> A short message from my sponsors: Vive la France!
>>> Social Web Architect
>>> http://bblfish.net/
>>>
>> -- 
>> Steve Harris, CTO
>> Garlik, a part of Experian
>> +44 20 3042 4132  http://www.garlik.com/
>> Registered in England and Wales 653331 VAT # 887 1335 93
>> 80 Victoria Street, London, SW1E 5JL
>>
>
Received on Wednesday, 19 December 2012 18:48:01 UTC