Re: Indicating Skolem Nodes (was Re: AW: {Disarmed} Re: blank nodes (once again)) from Andy Seaborne on 2011-03-27 (semantic-web@w3.org from March 2011)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Sun, 27 Mar 2011 16:30:02 +0100
To: Steve Harris <steve.harris@garlik.com>
CC: Sandro Hawke <sandro@w3.org>, semantic-web@w3.org
Message-ID: <4D8F57FA.8070004@epimorphics.com>
On 25/03/11 18:10, Steve Harris wrote:
> On 2011-03-25, at 16:57, Sandro Hawke wrote:
>
>> On Fri, 2011-03-25 at 16:01 +0000, Steve Harris wrote:
>>> On 2011-03-25, at 15:41, Pat Hayes wrote:
>>>>
>>>> On Mar 25, 2011, at 10:05 AM, Sandro Hawke wrote:
>>>>
>>>>> Thanks for the detailed answer, but I'm pretty sure you're answering a
>>>>> different question than I meant.   (Sorry for not being more clear.)
>>>>> What I meant was: is OWL 2 Full okay with people Skolemizing ontologies
>>>>> they are asserting?
>>>>>
>>>>> I might be misunderstanding, but it seems like all the problems you
>>>>> point out only arise during the entailment check.  And yes, I know you
>>>>> can't Skolemize a query.   I would never even think about doing that.
>>>>> I'm just talking about Skolemizing assertions.
>>>>>
>>>>> I think it's generally best to do queries in a query language and/or a rule
>>>>> language, but maybe that's a matter of taste.
>>>>>
>>>>> You say, "you never know how someone will use your graph", so I guess
>>>>> the point is that Alice might publish an ontology that gets Skolemized
>>>>> by her system, and then Bob publishes an identical ontology, and then
>>>>> when Charlie comes along and wants to find out whether Bob and Alice's
>>>>> ontologies entail each other, he's going to get a false negative because
>>>>> of the Skolemization.
>>>>
>>>> We can probably even fix this, in fact. If we can reliably distinguish 'bnode URIs' from other URIs, e.g. if they all use a common namespace, then there is an obvious notion of graph equivalence which allows a 1:1 replacement of the skolem URIs.  And then Charlie can discover that, though not logically equivalent, Alice and Bob's graphs are graph-equivalent. People will write code to check things like this if it ever starts to matter to anyone. The cost of testing this is identical to the cost of checking graph equivalence right now (it's the same algorithm).
>>>
>>> Exactly.
>>>
>>> In the triplestores I'm familiar with (admittedly not that many), bNodes are skolemised into a value space that's different from both literals and URIs, so this is a natural consequence.
>>
>> So, is there a simple way we can flag them?   I know it's out of scope
>> for the RDF WG to define one, but maybe there's a solution that's so
>> simple everyone can just start doing it without a W3C process.
>
> Yes, 4store uses either:
>
> a)<bnode:b123456>, or
> b)<_:b123456>
>
> neither is what I'd call legitimate though.
>
> a) is an unregistered URI scheme, b) is not syntactically legal (URIs can't start with _), but has the advantage that you can take Turtle-syntax results, and stick <>s round it, which feels somehow appropriate.
>
> These arose out of a pragmatic need to handle FOAF data found in the wild, which is riddled with bNodes.
>
> Of your strawmen below, I prefer 1, and would suggest bnode:. Being pragmatic about it, regardless of whether the URI scheme is official, if people like it, it will spread quickly. Like tag:, in fact.
>
> I know that some other triplestores also do b), but I'm not going to name and shame them here :)

Steve also wrote:
"""
In the triplestores I'm familiar with (admittedly not that many), bNodes
are skolemised into a value space that's different from both literals
and URIs, so this is a natural consequence.
"""
and bnode: does not put it into a different value space, only a specific 
corner of URI space ;-)

I prefer to put the bNodes in a clearly separate space, which is why 
using "_:" seems better exactly because it's not a valid URI scheme.  If 
it's fed into a strict IRI checker, it's an error.  The parser can spot 
it before IRI checking and keep it apart.  It's very definitely a 
skolemized bnode.

Then, if you want them to be dereferenceable, go ahead and make a 
dereferenceable name (i.e. separate the issues). But as the original 
use of bnodes is quite often the convenience of store-generated ids, 
always dereferenceable seems a bit heavy.

BTW bNodes get used for simple things like lists and the convenience of 
writing [] in Turtle (these can be nested when used in object position). 
The data is going to look quite ugly (= N-Triples and prefixes) 
afterwards.
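
To make the "afterwards" concrete, here is a rough sketch of wrapping bnode labels in <> on N-Triples-style lines, the form (b) that Steve describes. It assumes one triple per line with whitespace-free subject and predicate; `skolemize_line` is a hypothetical helper, not 4store's code, and it makes no attempt at full N-Triples parsing.

```python
import re

# subject, predicate, then everything up to the final " ." as the object.
_TRIPLE = re.compile(r"^\s*(\S+)\s+(\S+)\s+(.*?)\s*\.\s*$")


def skolemize_term(term: str) -> str:
    """Wrap a bnode label such as _:b1 in angle brackets; leave other terms alone."""
    return f"<{term}>" if term.startswith("_:") else term


def skolemize_line(line: str) -> str:
    """Rewrite bnodes in subject and object position of one triple line."""
    if line.lstrip().startswith("#"):
        return line  # comment line
    m = _TRIPLE.match(line)
    if m is None:
        return line  # blank line or something this sketch does not handle
    s, p, o = m.groups()
    # Literals start with a quote and IRIs with "<", so only real bnode
    # terms are rewritten; "_:" text inside a literal is untouched.
    return f"{skolemize_term(s)} {p} {skolemize_term(o)} ."
```

For example, `_:b1 <http://xmlns.com/foaf/0.1/knows> _:b2 .` becomes `<_:b1> <http://xmlns.com/foaf/0.1/knows> <_:b2> .`, which shows how a nested [] in Turtle flattens into rows of generated labels.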

	Andy
>
> - Steve
>
>>         Strawman 1: make new URI scheme for this
>>
>> Con: very hard to do. (It took me and Tim Kindberg 4+ years to get the
>> "tag:" URI scheme RFC published.  Hopefully it's gotten much easier, but
>> still I'm hesitant.)
>> Con: it wouldn't be a link for linked data
>>
>>         Strawman 2: use urn:uuid:<uuid>
>>
>> Con: there might be some false-positives, because of people using UUIDs
>> who don't mean them like this
>> Con: might be longer than necessary
>> Con: no helpful human-readable element
>> Con: no link for linked data
>>
>>         Strawman 3: use tag:w3.org,2000:Skolem:<some optional
>>         text>:<uuid>
>>
>> Con: no link for linked data
>> Con: might be longer than necessary
>>
>>         Strawman 4: use any IRI with some magic string in it, like
>>         "SkBNode" or "$+SKNB+$".
>>
>> Con: some false positives, as magic string may appear in a few IRIs
>> where it was not intended (such as blog posts about the concept, which
>> use it in the title, or other naive machine generated URLs).
>>
>> For me, the clear winner is Strawman 4, because I really like being able
>> to dereference stuff, even if it's a Skolem constant.  This allows the
>> Skolemizer to provide web service if it wants to.  You can also use 4
>> with a tag: URI if you don't want to support dereference.
>>
>>     -- Sandro
>>
>>
>
Received on Sunday, 27 March 2011 15:30:41 UTC