Re: Indicating Skolem Nodes (was Re: AW: {Disarmed} Re: blank nodes (once again)) from Sandro Hawke on 2011-03-26 (semantic-web@w3.org from March 2011)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 25 Mar 2011 23:27:05 -0400
To: Steve Harris <steve.harris@garlik.com>
Cc: semantic-web@w3.org
Message-ID: <1301110025.3138.2956.camel@waldron>
On Fri, 2011-03-25 at 18:10 +0000, Steve Harris wrote:
> On 2011-03-25, at 16:57, Sandro Hawke wrote:
> 
> > On Fri, 2011-03-25 at 16:01 +0000, Steve Harris wrote:
> >> On 2011-03-25, at 15:41, Pat Hayes wrote:
> >>> 
> >>> On Mar 25, 2011, at 10:05 AM, Sandro Hawke wrote:
> >>> 
> >>>> Thanks for the detailed answer, but I'm pretty sure you're answering a
> >>>> different question than I meant.   (Sorry for not being more clear.)
> >>>> What I meant was: is OWL 2 Full okay with people Skolemizing ontologies
> >>>> they are asserting?
> >>>> 
> >>>> I might be misunderstanding, but it seems like all the problems you
> >>>> point out only arise during the entailment check.  And yes, I know you
> >>>> can't Skolemize a query.   I would never even think about doing that.
> >>>> I'm just talking about Skolemizing assertions.
> >>>> 
> >>>> I think it's generally best to do queries in a query language and/or a rule
> >>>> language, but maybe that's a matter of taste.
> >>>> 
> >>>> You say, "you never know how someone will use your graph", so I guess
> >>>> the point is that Alice might publish an ontology that gets Skolemized
> >>>> by her system, and then Bob publishes an identical ontology, and then
> >>>> when Charlie comes along and wants to find out whether Bob and Alice's
> >>>> ontologies entail each other, he's going to get a false negative because
> >>>> of the Skolemization.
> >>> 
> >>> We can probably even fix this, in fact. If we can reliably distinguish 'bnode URIs' from other URIs, e.g. if they all use a common namespace, then there is an obvious notion of graph equivalence which allows a 1:1 replacement of the skolem URIs.  And then Charlie can discover that, though not logically equivalent, Alice and Bob's graphs are graph-equivalent. People will write code to check things like this if it ever starts to matter to anyone. The cost of testing this is identical to the cost of checking graph equivalence right now (it's the same algorithm).
> >> 
> >> Exactly.
> >> 
> >> In the triplestores I'm familiar with (admittedly not that many), bNodes are skolemised into a value space that's different from both literals and URIs, so this is a natural consequence.
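
The graph-equivalence check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: triples are plain tuples of strings, skolem IRIs are recognised by a shared "bnode:" prefix (the scheme is a proposal from this thread, not a registered one), and the names is_skolem, rename, and graph_equivalent are made up for the example. It tries every 1:1 renaming by brute force, which is fine for tiny graphs but is not how a real store would do it.

```python
from itertools import permutations

BNODE_PREFIX = "bnode:"  # assumed common namespace for skolem IRIs


def is_skolem(term):
    """True if the term looks like a skolem IRI (assumption: shared prefix)."""
    return term.startswith(BNODE_PREFIX)


def skolem_terms(graph):
    """Sorted list of the distinct skolem IRIs used anywhere in the graph."""
    return sorted({t for triple in graph for t in triple if is_skolem(t)})


def rename(graph, mapping):
    """Apply a skolem-IRI renaming to every triple of the graph."""
    return {tuple(mapping.get(t, t) for t in triple) for triple in graph}


def graph_equivalent(g1, g2):
    """True if some 1:1 replacement of skolem IRIs turns g1 into g2."""
    g1, g2 = set(g1), set(g2)
    s1, s2 = skolem_terms(g1), skolem_terms(g2)
    if len(g1) != len(g2) or len(s1) != len(s2):
        return False
    # Brute force over all bijections between the two sets of skolem IRIs.
    return any(
        rename(g1, dict(zip(s1, perm))) == g2
        for perm in permutations(s2)
    )
```

With this, two graphs that differ only in which skolem IRIs their systems happened to mint compare as unequal sets of triples but as graph-equivalent, which is the distinction Charlie needs.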
> > 
> > So, is there a simple way we can flag them?   I know it's out of scope
> > for the RDF WG to define one, but maybe there's a solution that's so
> > simple everyone can just start doing it without a W3C process.
> 
> Yes, 4store uses either:
> 
> a) <bnode:b123456>, or
> b) <_:b123456>
> 
> neither is what I'd call legitimate though.
> 
> a) is an unregistered URI scheme, b) is not syntactically legal (URIs can't start with _), but has the advantage that you can take Turtle-syntax results, and stick <>s round it, which feels somehow appropriate.
> 
> These arose out of a pragmatic need to handle FOAF data found in the wild, which is riddled with bNodes. 
> 
> Of your strawmen below, I prefer 1, and would suggest bnode:. Being pragmatic about it, regardless of whether the URI scheme is official, if people like it, it will spread quickly. Like tag: in fact.

True.

So, refined:

Strawman 1:

 - new URI scheme, "bnode", followed by text that won't accidentally be
shared.  Specifically, we refer to RFC 4122 and 4151, saying a bnode: is
followed by either a urn:uuid: (without the "urn:uuid:") or a tag:
(without the "tag:").  That lets you make it human readable if you want.
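
A minimal sketch of what minting such identifiers might look like, assuming the "bnode:" scheme as proposed here (it is not registered anywhere). The function names and the example.org authority in the usage note are hypothetical; the UUID form uses Python's standard uuid module, and the tag form follows the authority,date:specific layout of RFC 4151.

```python
import uuid

SCHEME = "bnode:"  # proposed in this thread; not a registered URI scheme


def mint_uuid_bnode():
    """urn:uuid:<uuid> with the 'urn:uuid:' prefix replaced by 'bnode:'."""
    return SCHEME + str(uuid.uuid4())


def mint_tag_bnode(authority, date, specific):
    """tag:<authority>,<date>:<specific> with 'tag:' replaced by 'bnode:'."""
    return f"{SCHEME}{authority},{date}:{specific}"


def is_bnode_iri(iri):
    """Scheme names are case-insensitive, so compare in lower case."""
    return iri.lower().startswith(SCHEME)
```

For example, mint_tag_bnode("example.org", "2011-03-25", "alice-address") gives "bnode:example.org,2011-03-25:alice-address", which keeps the human-readable element, while mint_uuid_bnode() gives an opaque but collision-resistant name.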

Hm.

I wouldn't object to this -- it's the cleanest engineering, I think --
but I worry about not being able to dereference, so I still prefer the
magic string approach.   I realize that's a bit of a hack, though.  But
it's a hack that I think would work just fine ... so that makes it
elegant, right?  :-)

    -- Sandro

> I know that some other triplestores also do b), but I'm not going to name and shame them here :)
> 
> - Steve
> 
> >        Strawman 1: make new URI scheme for this
> > 
> > Con: very hard to do. (It took me and Tim Kindberg 4+ years to get the
> > "tag:" URI scheme RFC published.  Hopefully it's gotten much easier, but
> > still I'm hesitant.)
> > Con: it wouldn't be a link for linked data
> > 
> >        Strawman 2: use urn:uuid:<uuid>
> > 
> > Con: there might be some false-positives, because of people using UUIDs
> > who don't mean them like this
> > Con: might be longer than necessary
> > Con: no helpful human-readable element
> > Con: no link for linked data
> > 
> >        Strawman 3: use tag:w3.org,2000:Skolem:<some optional
> >        text>:<uuid>
> > 
> > Con: no link for linked data
> > Con: might be longer than necessary
> > 
> >        Strawman 4: use any IRI with some magic string in it, like
> >        "SkBNode" or "$+SKNB+$".
> > 
> > Con: some false positives, as magic string may appear in a few IRIs
> > where it was not intended (such as blog posts about the concept, which
> > use it in the title, or other naive machine generated URLs).
> > 
> > For me, the clear winner is Strawman 4, because I really like being able
> > to dereference stuff, even if it's a Skolem constant.  This allows the
> > Skolemizer to provide a web service if it wants to.  You can also use 4
> > with a tag: URI if you don't want to support dereference.
> > 
> >    -- Sandro
> > 
> > 
>
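
Strawman 4 can be sketched in a few lines. This assumes "SkBNode" as the magic string (one of the two examples floated above) and a hypothetical http://example.org/id base; skolemize and is_skolem_iri are illustrative names, not an existing API.

```python
import uuid

MAGIC = "SkBNode"  # example marker from the thread; any agreed string would do


def skolemize(base, label=""):
    """Mint an IRI under `base` that carries the marker and a fresh UUID.

    Because it is an ordinary http IRI, the Skolemizer can choose to
    serve something at it, which is the point of this strawman.
    """
    suffix = f"{label}-{uuid.uuid4()}" if label else str(uuid.uuid4())
    return f"{base.rstrip('/')}/{MAGIC}/{suffix}"


def is_skolem_iri(iri):
    """Plain substring test, so false positives are possible by design."""
    return MAGIC in iri
```

The substring test is what makes the approach cheap, and also what produces the false positives listed under the cons: is_skolem_iri("http://blog.example.org/why-SkBNode-is-a-hack") is true even though nobody minted that IRI as a Skolem constant.
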
Received on Saturday, 26 March 2011 03:27:14 UTC