Re: Well Behaved RDF - Taming Blank Nodes, etc. from Henry Story on 2012-12-19 (semantic-web@w3.org from December 2012)

From: Henry Story <henry.story@bblfish.net>
Date: Wed, 19 Dec 2012 20:05:32 +0100
To: Lee Feigenbaum <lee@thefigtrees.net>
Cc: Steve Harris <steve.harris@garlik.com>, Pat Hayes <phayes@ihmc.us>, David Booth <david@dbooth.org>, semantic-web <semantic-web@w3.org>
Message-Id: <5B55BC5A-91F3-42DC-BE94-7240B8EC62CD@bblfish.net>
On 19 Dec 2012, at 19:47, Lee Feigenbaum <lee@thefigtrees.net> wrote:

> Henry, could I accurately summarize your reasons for using blank nodes as:
> 
> 1. I can't trust the data other people publish.

Not really. You can trust some data and not other data, but that was not the argument I put forward.

> 2. Blank nodes mean that no one else can say (false) things about my data.

Not at all. Blank nodes mean there is no intension in the name. Or rather its intension is well
known: it is an existential claim and nothing more. Everything else is only something
you can deduce from the position of that object in the local graph.
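
To make the "local graph" point concrete, here is a minimal sketch in Python. It is not any real RDF library: triples are plain tuples of strings, and the convention that blank node labels start with "_:" as well as the helper names are assumptions made only for this illustration. When two graphs are merged, blank nodes are renamed apart, so the same label in two documents never ends up denoting the same thing, whereas identical URIs do merge.

```python
from itertools import count

def is_bnode(term):
    # Convention for this sketch only: blank node labels start with "_:".
    return term.startswith("_:")

def merge(*graphs):
    """Merge graphs, renaming blank nodes apart so labels never clash."""
    fresh = count()
    merged = set()
    for graph in graphs:
        renaming = {}  # blank node labels are scoped to a single graph
        for triple in graph:
            new_triple = []
            for term in triple:
                if is_bnode(term):
                    if term not in renaming:
                        renaming[term] = f"_:b{next(fresh)}"
                    term = renaming[term]
                new_triple.append(term)
            merged.add(tuple(new_triple))
    return merged

# Two documents that happen to use the same blank node label.
g1 = {("_:x", "foaf:name", '"Alice"')}
g2 = {("_:x", "foaf:name", '"Bob"')}
bnode_subjects = {s for s, _, _ in merge(g1, g2)}

# Two documents that use the same (hypothetical) URI.
g3 = {("http://example.org/card#me", "foaf:name", '"Alice"')}
g4 = {("http://example.org/card#me", "foaf:nick", '"al"')}
uri_subjects = {s for s, _, _ in merge(g3, g4)}
```

After the merge, bnode_subjects holds two distinct nodes while uri_subjects holds one: the blank nodes only say "something exists" in their own graph, and the URI is the global name.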

> 3. (Also, HTTP URIs mean that I can easily resolve disputes in what something should mean.)

With HTTP URIs you have at least the beginning of something you can do to settle
disputes. Or rather you have something of a grasp on the meaning.

The point of using URIs to name things is to help you merge information in a global
space. UUIDs look like they give you that, but they don't, because you end
up having to say: the UUID as used by web site X or web site Y or .... So you are slowly going to
be pushed back to something like bnodes, and now things are complicated, because you'll then
have bnodes, URIs and intermediate things...

> Is that fair?
> 
> If it is fair, let's go back for a second to the first time I brought up UUIDs:
> 
> You said:
> 
> """
> 
> The web creates such resources all the time: whenever you POST a form
> and the server does not give you a URI for the returned result, you have in
> fact created a resource that has not got a name. This resource will need to be
> described using a blank node. It could be described well enough as a creation of
> that form, and with date and time, but giving it a name on that server would be
> an error, and forcing oneself to name a remote resource, when the remote owner
> did not want to do that is more work than you may want.
> """
> 
> I said:
> 
> """
> 
> What's wrong with generating a UUID-based URI for a POST request?
> """

And you mention "UUID-based URI" above (in answer to a point you make below).

> And you said:
> 
> """
> 
> There's nothing wrong per se. But it is more prone to a mistake happening
> than having a blank node. The blank node is the easiest way to deal with
> things that don't have names.
> """
> 
> It's somewhat clear to me from the rest of this conversation that you've been assuming <urn:uuid:...> type URIs, since (if my above summary is accurate), your objections come down to not being able to resolve a URI for trust/proof purposes. While I use <urn:uuid:...> URIs at times, I also often use http://-based URIs that incorporate UUIDs into them. Do you have objections to this practice as well? (That is, I could identify each POST request that comes in with a URI of the form http://my.domain.com/{UUID}

You can always mint your own URIs in your own domain space of course.

> . (At which point I could of course make them resolvable if potential trust/proof issues were important to my data/application.)

Yes. That's fine.
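
For what it is worth, the two styles of name being discussed can be sketched in a few lines of Python using only the standard uuid module. The base http://my.domain.com/ is the one from Lee's example; the function names are hypothetical and nothing here makes the HTTP URI resolvable, which would still need a server answering at that address.

```python
import uuid

BASE = "http://my.domain.com/"  # domain taken from the example in this thread

def mint_post_uri(base=BASE):
    """Mint an HTTP URI in our own domain space with a random UUID as path."""
    return base + str(uuid.uuid4())

def mint_urn():
    """Mint a urn:uuid: name, which has no document one could GET."""
    return uuid.uuid4().urn

post_uri = mint_post_uri()
urn_name = mint_urn()
```

Only the first form leaves open the option of later publishing a description at the name itself.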

It still remains that the web as it is creates resources without names. So bnodes are still justified,
it seems to me. One should use them with caution, just as one should use URIs with caution of course.

Now one could go further and see what are the fundamental differences between URLs and blank nodes. Perhaps that would be useful.

1. With bnodes there is no intension associated with the name.
2. With well-functioning URIs there is an intension associated with the name, such that you can GET its meaning in the document in which it is defined - and there relative URIs play an important role!

I think getting at the consequences of these distinctions would be very fruitful, and you'll find that these are just different tools in your toolbox. Perhaps the use of these tools needs to be explained more carefully.
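
As a small illustration of the second point about URIs and the role of relative URIs, here is a sketch using Python's standard urllib.parse. The document URL is hypothetical. A relative reference such as "#me" written inside a document resolves against that document's own URL, and stripping the fragment gives back the document one would GET to find the definition.

```python
from urllib.parse import urljoin, urldefrag

# Hypothetical document in which a term is defined with a relative URI.
doc = "http://example.org/people/card"

# The relative reference "#me" becomes a global name tied to that document.
term = urljoin(doc, "#me")

# Dropping the fragment yields the document to GET for the term's meaning.
definition_doc, fragment = urldefrag(term)
```

Here term is http://example.org/people/card#me and definition_doc is the original document URL. A blank node offers no such step: there is nothing to dereference.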

> 
> Lee
> 
> 
> On 12/19/2012 1:27 PM, Henry Story wrote:
>> On 19 Dec 2012, at 19:10, Steve Harris <steve.harris@garlik.com> wrote:
>> 
>>> On 2012-12-19, at 17:50, Henry Story wrote:
>>>> On 19 Dec 2012, at 18:43, Steve Harris <steve.harris@garlik.com> wrote:
>>>> 
>>>>> On 2012-12-19, at 16:36, Lee Feigenbaum wrote:
>>>>>>> Henry Story Wrote:
>>>>>>> In any case otherwise you end up with names that are just complicated blank nodes, and you
>>>>>>> then have exactly the same problem as blank nodes, except you just end up growing and
>>>>>>> growing your names as you go along.
>>>>>> Well, except they don't have the same problems as blank nodes: UUID URIs are stable from one query to the next and can be linked to and referenced across document/database-context.
>>>>> Yes, this is the key problem with bNodes, which means you have to be /really/ careful about how and when you use them.
>>>> No, it's the opposite. This is a key problem with UUIDs as I argued in my later mail
>>>> http://lists.w3.org/Archives/Public/semantic-web/2012Dec/0097.html
>>> Yes, but I don't buy your arguments.
>>> 
>>> You can't "prove" that you "created" some http: URI either, unless the document is signed by an unrevoked key, and that works just as well for any kind of URI.
>> The point is that there is a way one can come to agree what the definition of a term means for http
>> URIs. You GET it.
>> 
>> There really is no way to do so for a UUID. If two people dispute the meaning of the term, there is no
>> way you can come to decide on who was right to use it that way, since either could have come
>> to mint it. But next, even if you really worked hard on it, how would you know what the meaning
>> of the term was?
>> 
>> And all of that needs to be put into context of what a machine can do reasonably easily. Whatever
>> the proof procedure for finding the meaning of a UUID is it's not something that is going to be doable
>> automatically. It would require expert police officers, inquisitions, highly specialised teams to work
>> out what is what in there, with access to hardware etc...
>> 
>> Don't forget that I am responding to the following:
>> "UUID URIs are stable from one query to the next and can be linked to and referenced across document/database-context."
>> 
>> The name is stable yes, and there are advantages to that, but the meaning is not going to
>> be understood, since you have no clear way of telling two divergent meanings apart. So they
>> are not really as linkable as you think.
>> 
>>> Also, you say "If you use a UUID you could accidentally make a UUID that someone else has already used." well, it's either not a UUID (e.g. a bogus implementation) or there's some statistically insignificant chance http://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates
>>> neither of those cases is very relevant.
>> Well in one case you have no chance of making a mistake (bnodes), in the other you have what you think is
>> a statistically small chance, but you are not taking into account bad faith. Those are not at all the same
>> thing. It's the difference between a mathematical truth that is necessarily true, and one that is contingent.
>> 
>> 
>>>> Not every thing that looks like a URI really works like one. For example file:///... URIs
>>>> usually are not global identifiers, and even though software accepts it, it's just a hack
>>>> people use to get around software that forces them into this kind of situation.
>>> There are valid uses for file: URIs, but yes, you have to be careful.
>>> 
>>>> UUIDs are not a good way to go. They make it look like there is agreement, when in fact
>>>> conceptually things are just as broken.
>>> How does that relate to bNodes? Software doesn't [typically :)] have opinions about the appearance of identifiers.
>> The point is that people use things that look like global identifiers, because some software
>> shoe-horns them into providing things that look like they are identifiers, even though
>> they don't really function in the right way.  So if you forced people to use URIs instead
>> of bnodes, they'd end up using URIs that were not global identifiers, but that just looked
>> like them.
>> 
>>> - Steve
>>> 
>>>>> We have local hacks to get round the issue, but that's not great. RDF 1.1 bNode skolemisation provides a more generic solution.
>>>> Something I'd have to look into.
>>>> 
>>>>> - Steve
>>>> A short message from my sponsors: Vive la France!
>>>> Social Web Architect
>>>> http://bblfish.net/
>>>> 
>>> -- 
>>> Steve Harris, CTO
>>> Garlik, a part of Experian
>>> +44 20 3042 4132  http://www.garlik.com/
>>> Registered in England and Wales 653331 VAT # 887 1335 93
>>> 80 Victoria Street, London, SW1E 5JL
>>> 
>> A short message from my sponsors: Vive la France!
>> Social Web Architect
>> http://bblfish.net/
>> 
> 

A short message from my sponsors: Vive la France!
Social Web Architect
http://bblfish.net/
Received on Wednesday, 19 December 2012 19:06:10 UTC