- From: Steve Harris <steve.harris@garlik.com>
- Date: Wed, 19 Dec 2012 19:14:14 +0000
- To: Henry Story <henry.story@bblfish.net>
- Cc: Lee Feigenbaum <lee@thefigtrees.net>, Pat Hayes <phayes@ihmc.us>, David Booth <david@dbooth.org>, semantic-web <semantic-web@w3.org>
On 19 Dec 2012, at 18:27, Henry Story wrote:
>
> On 19 Dec 2012, at 19:10, Steve Harris <steve.harris@garlik.com> wrote:
>
>> On 2012-12-19, at 17:50, Henry Story wrote:
>>>
>>> On 19 Dec 2012, at 18:43, Steve Harris <steve.harris@garlik.com> wrote:
>>>
>>>> On 2012-12-19, at 16:36, Lee Feigenbaum wrote:
>>>>>> Henry Story Wrote:
>>>>>> In any case otherwise you end up with names that are just complicated blank nodes, and you
>>>>>> then have exactly the same problem as blank nodes, except you just end up growing and
>>>>>> growing your names as you go along.
>>>>>
>>>>> Well, except they don't have the same problems as blank nodes: UUID URIs are stable from one query to the next and can be linked to and referenced across document/database-context.
>>>>
>>>> Yes, this is the key problem with bNodes, which means you have to be /really/ careful about how and when you use them.
>>>
>>> No, it's the opposite. This is a key problem with UUIDs as I argued in my later mail
>>> http://lists.w3.org/Archives/Public/semantic-web/2012Dec/0097.html
>>
>> Yes, but I don't buy your arguments.
>>
>> You can't "prove" that you "created" some http: URI either, unless the document is signed by an unrevoked key, and that works just as well for any kind of URI.
>
> The point is that there is a way one can come to agree what the definition of a term means for http
> URIs. You GET it.
As Lee says below, you can GET some UUID-based URIs too.
> There really is no way to do so for a UUID. If two people dispute the meaning of the term, there is no
> way you can come to decide on who was right to use it that way, since either could have come
> to mint it. But next, even if you really worked hard on it, how would you know what the meaning
> of the term was?
>
> And all of that needs to be put into context of what a machine can do reasonably easily. Whatever
> the proof procedure for finding the meaning of a UUID is it's not something that is going to be doable
> automatically. It would require expert police officers, inquisitions, highly specialised teams to work
> out what is what in there, with access to hardware etc…
I agree with this bit, but I don't think a machine can reasonably easily resolve a dispute about the meaning of a dereferenceable URI just by dereferencing it and doing some computation on the result. I'd love to be proved wrong though. The signed doc case is reasonably easy - as long as you trust the veracity of the private key (it's all degrees of trust). It's still just a claim though.
Nothing stops someone (well, apart from some legal and technical issues!) from publishing data from your domain, using "your" URIs, and it would be very hard for a machine to tell that someone had done that. It's unlikely in 2012, but far more likely to happen than a UUID clash.
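For a rough sense of scale on the UUID clash side, here is a sketch of the usual birthday-bound arithmetic. It assumes version-4 (random) UUIDs with 122 random bits and an honest, uniformly random generator; the function name is just for illustration, and it says nothing about bad faith or buggy implementations.

```python
import math

# Assumption: version-4 UUIDs, i.e. 128 bits minus 6 fixed version/variant bits.
RANDOM_BITS = 122

def collision_probability(n: float, bits: int = RANDOM_BITS) -> float:
    """Approximate chance of at least one duplicate among n random IDs.

    Uses the birthday-bound approximation p ~= 1 - exp(-n^2 / (2 * 2^bits)).
    """
    exponent = (n * n) / (2.0 * 2.0 ** bits)
    # expm1 keeps precision when the exponent is tiny.
    return -math.expm1(-exponent)

p_billion = collision_probability(1e9)      # a billion UUIDs
p_half = collision_probability(2.71e18)     # roughly the 50% point
```

On those assumptions you need on the order of 2.71e18 identifiers before the chance of any duplicate reaches about one half.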
> Don't forget that I am responding to the following:
> "UUID URIs are stable from one query to the next and can be linked to and referenced across document/database-context."
>
> The name is stable yes, and there are advantages to that, but the meaning is not going to
> be understood, since you have no clear way of telling two divergent meanings apart. So they
> are not really as linkable as you think.
I don't /think/ that's different for any other kind of URI though.
>> Also, you say "If you use a UUID you could accidentally make a UUID that someone else has already used." well, it's either not a UUID (e.g. a bogus implementation) or there's some statistically insignificant chance http://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates
>> neither of those cases is very relevant.
>
> Well in one case you have no chance of making a mistake (bnodes), in the other you have what you think is
> a statistically small chance, but you are not taking into account bad faith. Those are not at all the same
> thing. It's the difference between a mathematical truth that is necessarily true, and one that is contingent.
There's a gap between a mathematical definition and the actions of humans.
If we have a TriG document like:
<A> {
_:x543543df a <Foo> .
...
}
…
<B> {
_:x543543df a <Bar> .
…
}
(suppose it was generated by some buggy process, or a typo, or whatever)
Then those two bNodes become conflated in the dataset.
The mathematical definition doesn't enter into it; it's just human error, or malice, or whatever.
If you only use [], then it can only happen because of typos, or bugs, but it can still happen.
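A toy sketch of that conflation, in plain Python rather than any real RDF library. It assumes blank node labels are scoped to the whole document rather than to each graph, and the tuple layout and function name are made up for illustration.

```python
from collections import defaultdict

# Toy quads (graph, subject, predicate, object) mirroring the TriG example.
# Assumption: the label _:x543543df is document-scoped, so both graphs
# refer to the same node.
quads = [
    ("A", "_:x543543df", "rdf:type", "Foo"),
    ("B", "_:x543543df", "rdf:type", "Bar"),
]

def types_by_node(quads):
    """Collect the types asserted for each subject across all graphs."""
    types = defaultdict(set)
    for _graph, subject, predicate, obj in quads:
        if predicate == "rdf:type":
            types[subject].add(obj)
    return dict(types)

# The single shared label ends up typed as both Foo and Bar.
merged = types_by_node(quads)
```

Nothing in the data distinguishes a deliberate reuse of the label from an accidental one.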
>>> Not every thing that looks like a URI really works like one. For example file:///... URIs
>>> usually are not global identifiers, and even though software accepts it, it's just a hack
>>> people use to get around software that forces them into this kind of situation.
>>
>> There are valid uses for file: URIs, but yes, you have to be careful.
>>
>>> UUIDs are not a good way to go. They make it look like there is agreement, when in fact
>>> conceptually things are just as broken.
>>
>> How does that relate to bNodes? Software doesn't [typically :)] have opinions about the appearance of identifiers.
>
> The point is that people use things that look like global identifiers, because some software
> shoe-horns them into providing things that look like they are identifiers, even though
> they don't really function in the right way. So if you forced people to use URIs instead
> of bnodes, they'd end up using URIs that were not global identifiers, but that just looked
> like them.
Perhaps.
FWIW I'm not arguing that bNodes should be banned, just that the definition of them is not very useful. I would like to see them as indicators for the processor to replace them with some globally unique identifier - UUIDs are one candidate.
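To make that concrete, a minimal sketch of what such a processor step might look like, assuming triples are plain string tuples and that any term starting with "_:" is a bNode label (a real parser would distinguish literals properly). The helper names are hypothetical; only the standard uuid module is used.

```python
import uuid

def mint_uuid_urn() -> str:
    """Mint a fresh urn:uuid: identifier from a random (version-4) UUID."""
    return uuid.uuid4().urn

def replace_bnodes(triples, mint=mint_uuid_urn):
    """Replace each bNode label in one document with a freshly minted URI.

    The same label maps to the same URI within a single call, so structure
    inside the document is preserved; separate calls never share a mapping.
    """
    mapping = {}

    def term(t):
        if t.startswith("_:"):  # toy test for a bNode label
            if t not in mapping:
                mapping[t] = mint()
            return mapping[t]
        return t

    return [(term(s), p, term(o)) for s, p, o in triples], mapping

doc_a = [("_:x", "rdf:type", "Foo"), ("_:x", "ex:knows", "_:y")]
doc_b = [("_:x", "rdf:type", "Bar")]
out_a, map_a = replace_bnodes(doc_a)
out_b, map_b = replace_bnodes(doc_b)
```

Here the two documents both use the label _:x, but each gets its own identifier, so loading them into one dataset no longer conflates them.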
- Steve
Received on Wednesday, 19 December 2012 19:14:35 UTC