Re: One semi-historical point (was Re: ISSUE: DISTINCT is underspecified)

On Aug 19, 2006, at 3:11 AM, Pat Hayes wrote:

>> On Aug 17, 2006, at 8:34 AM, Pat Hayes wrote:
>> [snip]
>>
>>> Well, this is an aside to our main discussion here, but I think  
>>> that it would be quite acceptable to have an RDF query standard  
>>> which was defined *entirely* syntactically, and simply treated an  
>>> RDF graph as a triple store, and used essentially algebraic  
>>> operations to troll through it for patterns that match and  
>>> satisfy superficial conditions (which might include semantic  
>>> conditions if those can be computed locally, eg typed literal  
>>> values). This was basically the design we had originally, about  
>>> which so many protests were received wanting a more 'semantic'  
>>> account.
>>
>> I think this is a point of misunderstanding. If y'all said, "We're  
>> defining the Query Of RDF Syntax (QORS) language, and if you want  
>> to do simple, or rdf, or rdfs, with or without D, or even OWL,  
>> well, you're out of luck, make a new standard and you might  
>> consider borrowing our query language syntax" I would not have  
>> objected.
>
> But I don't think you *are* out of luck for RDF/S/D.

You definitely are, in the sense that there are decisions that have to  
be made about how to handle queries of, e.g., inconsistent graphs, etc.

> OWL... well yes, OWL is more complicated because it has  
> disjunctions in it.

I can amend what I would have been happy with y'all saying:

"""If y'all had said, "We're defining the Query Of RDF Syntax (QORS)  
language, and if you want to do simple, or rdf, or rdfs, with or  
without D, you can get most of the joy by considering fixed point  
extensions of simply interpreted graphs, plus dealing with some  
corner cases. However, that will really suck for OWL. In either case,  
there are decisions you could make that would result in
non-interoperable implementations."""
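
To make "fixed point extensions of simply interpreted graphs" a bit more concrete, here is a minimal sketch. It assumes only two of the RDFS entailment rules from the RDF Semantics document (rdfs9 and rdfs11), a graph held as a Python set of string triples, and made-up ex: names. The function name is hypothetical too, and a real RDFS closure needs the remaining rules plus the corner cases mentioned above.

```python
# Sketch only: close a graph under two RDFS rules until nothing new appears.
# Queries could then be answered by plain (simple) matching on the result.
# The ex: names and the function name are illustrative assumptions.

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def rdfs_fixed_point(graph):
    """Return the least superset of `graph` closed under rdfs9 and rdfs11."""
    closed = set(graph)
    while True:
        new = set()
        for (s, p, o) in closed:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in closed:
                # rdfs11: rdfs:subClassOf is transitive
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))
                # rdfs9: an instance of a subclass is an instance of the superclass
                if p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))
        if new <= closed:
            # Fixed point reached: another pass would add nothing.
            return closed
        closed |= new

graph = {
    ("ex:bob", TYPE, "ex:Student"),
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
}
closed = rdfs_fixed_point(graph)
```

The loop terminates because these two rules only recombine terms already in the graph. The OWL case is where this style of closure stops being adequate, since a disjunction cannot be written out as extra triples in a single graph.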

>> Or at least not in the same way. I'm fine without a unitary  
>> semantic framework per se. Just let me *say* in my own way, what  
>> the semantics are (ok, some framework would be nice, just to make  
>> them easier to read; but I'm fine with an outlier), and give me  
>> good hooks to indicate when I'm using it (and the hooks should be  
>> in the query language please; not all sparql query is across the  
>> web in any interesting way).
>>
>> What I objected to was 1) a syntactic reading that in no way tied  
>> to the RDF Semantics document
>
> It was tied to it, in the sense that the subgraph/match design  
> corresponds directly to simple entailment.

Think of a more active reading of "tied". I.e., I don't think there are  
no connections, but those connections were not *made* in the  
document, nor in most of the discussions. Hence the idea of a tie  
between "a reading" (given concrete form in the sparql document) and  
"document".

Look, when I come back to the SPARQL document after a while, I,  
someone who's spent a lot of time inside the RDF Semantics document,  
find things difficult to mesh. Not all the choices going from RDF to  
SPARQL are obvious, and not all have obvious answers (at least in the  
sense that arbitrary sensible people will all agree).

> And the entailment lemmas then give you at least a spec for RDF and  
> RDFS, if not a very good implementation strategy off the cuff, as  
> it were. I agree, the XSD case needs more work.
>
>> , and 2) the claim that this would require no change in order to  
>> work for simple, rdf, rdfs, and even OWL queries.
>
> I hope nobody made that last claim about OWL. I certainly did not.

Let's even amend it to "little change". It certainly was the  
impression many of us got. My first exposure was at the F2F in Boston:
	http://www.w3.org/2001/sw/DataAccess/ftf5-bos.html

Which incorrectly only notes your regrets and not your remote  
participation. Oh geez, I see that this document presents me as  
saying something insufferably stupid:

"""ACTION KendallC: to incorporate service description discussion  
notes in protocol spec
Discussion resumed after a break, when Bijan re-joined us. We  
fidgeted to get the whiteboard near the phone and such...

Bijan explained how in OWL, sometimes the "deductive closure" isn't  
well-defined, especially in cases involving disjunction. We discussed  
an example: :bob :loves [ a [ owl:unionOf (:Students :Faculty) ] ]"""

Deductive closure certainly is well-defined, even in cases involving  
disjunction. People were, IIRC, using the term "Graph closure", and I  
think that *can* have different readings.

http://lists.w3.org/Archives/Public/public-rdf-dawg/2005JulSep/0478.html

Hmm. Here I am presenting how to do deductive closure with OWL blah  
blah. I thought I got that *from you* in the Tech Plenary session.

Since you seem to have been convinced by an Enrico example (little  
house), that at least suggests that you thought that virtual graphs  
presented no inherent specification difficulties when extended to  
OWL. I feel sure that people felt that way. I had to argue quite a  
bit at the Tech Plenary to show the group otherwise.

See also:

	http://lists.w3.org/Archives/Public/public-rdf-dawg/2005JulSep/0496.html

Oh, and:

	http://lists.w3.org/Archives/Public/public-rdf-dawg/2005JulSep/0498.html

In which I use the terminology "distinguished" variable. So even in  
this group, it was used before.

>> This is manifestly not true. There are people in the working group  
>> who support RDFS, and at the very least you have to say something  
>> about contradictory documents.
>
> True, but you can follow what the RDF semantics document says,  
> which refers to the concept of  'datatype clash'.

How does that help?

"""Datatype clashes are the only inconsistencies recognized by this  
model theory; note however that datatype clashes involving XML  
literals can arise in RDFS."""

I thought I *was* following what the RDF semantics document says.  
Traditionally, inconsistent databases have no meaningful answers,  
hence the rise of so much work on paraconsistency in the
database/logic programming area.

> The point being that quite a lot of this was already worked out in  
> the RDF WG and is documented (with some known bugs, which are now  
> fixed by others), and we *could* have simply directly used this  
> stuff, with minimal change.

If you read my email above
(<http://lists.w3.org/Archives/Public/public-rdf-dawg/2005JulSep/0478.html>)
you'll see me write:
"""> 2. Right now the basic operation is described as follows. A
 > substitution B is a mapping from variables of a pattern Q to URIrefs,
 > literals or bnodes, and an answer is a binding with B(Q) a subgraph of
 > KB. There seems to be a strongly and widely held opinion that it would
 > be less confusing, or at least helpful to many folk, to describe this
 > in terms of simple entailment

Almost, I would like the wording to be "entails" where the
entailment relation may vary. I think the document should explicitly list
simple with told-bnode redundancy, simple, and RDF entailment. I think
it would be nice if it listed RDFS and the various owl entailments.
Providing URIs for these would be excellent!"""

So that's where you claimed, like now, that we thought it would be  
clearer, but I thought it was more important to *define* them all and  
get it right. That's the overriding point. And to do that you have to  
look at the semantics of each relation.
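
For reference, the purely syntactic operation quoted above (a substitution B with B(Q) a subgraph of KB) can be sketched as follows. This is a hedged illustration, not the wording of any draft: the "?" prefix for variables, the ex: names, and the helper names are all assumptions, and bnodes in the KB are simply treated as ordinary terms that a variable may bind to.

```python
# Sketch only: enumerate substitutions B such that B(Q) is a subgraph of KB.
# Triples are (s, p, o) tuples of strings; variables start with "?".
# All names here are hypothetical.

def is_var(term):
    return term.startswith("?")

def match(pattern, kb, binding=None):
    """Yield each dict B mapping variables to terms with B(pattern) a subset of kb."""
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for triple in kb:
        b = dict(binding)
        ok = True
        for q, t in zip(first, triple):
            if is_var(q):
                # Bind the variable, or check it agrees with an earlier binding.
                if b.setdefault(q, t) != t:
                    ok = False
                    break
            elif q != t:
                ok = False
                break
        if ok:
            yield from match(rest, kb, b)

kb = {
    ("_:a", "ex:p", "ex:y"),
    ("ex:bob", "ex:p", "ex:y"),
    ("ex:bob", "ex:q", "ex:z"),
}
answers = sorted(b["?x"] for b in match([("?x", "ex:p", "ex:y")], kb))
joined = list(match([("?x", "ex:p", "ex:y"), ("?x", "ex:q", "?w")], kb))
```

Nothing in this matcher says which entailment relation is intended. That choice would have to show up in what counts as the KB (the told triples, or some closure of them) and in how redundant bnode answers are treated, which is exactly what needs defining per relation.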

> But it wouldn't extend this simply to OWL-DL, indeed.
>
>> Even if you go with maximal consistent subsets, that still needs  
>> to be said, explained, etc.
>
> True, true, there is work to be done. But it still would make for a  
> much easier basic design (and one which is tied closely to the  
> extant implementations) and a much simpler description. Issues of  
> how to describe binding restrictions are much simpler when the  
> notion of pattern matching is built into the primary definitions,  
> for example.
>
>> So my problem then is the same as my problem now: Lots of things  
>> are unspecified or underspecified. Some of the offered ways of  
>> specifying just wouldn't work very well, if at all.
>>
>>>>  And I think we should make the semantics available. (Now, of  
>>>> course, we're disagreeing on what the semantics require. Let me  
>>>> weaken my principle to say that it should help people understand  
>>>> the semantics of the graph.)
>>>
>>> OK, I'm quite happy with that reading. But I still think that it's  
>>> important to not suppress answers which can be used to extract  
>>> *semantically* distinct information entailed by the graph. I  
>>> guess my point is that it is the semantics of the *graph*, not of  
>>> the *answers*, that likely matter most to a querying agent.
>>
>> Well, we disagree. Or at least, I think focusing on the semantics of  
>> the answers is:
>> 	1) important
>> 	2) reasonable
>> 	3) easier to specify, understand, and implement
>>
>> This doesn't mean I feel a need to kick yours out, but if there's  
>> only one, yeah, this is the one I'll support.
>
> Fair enough that we disagree. However, I stick to my point. IMO,  
> the answers are primarily a way to extract information from the  
> graph. It's the graph that is being queried.

I don't see how that settles anything. Suppose there is a graph:
	_:x p y.
	_:x p y.

There's "information" in the graph, namely that this was asserted  
twice. Heck, I bet there are some implementations which have in their  
internal graph format:
	x p y.
	x p y.

Certainly could happen in an RDF/XML document. We don't treat that  
information seriously because it's not information that is  
distinguishable by the semantics of RDF. I will discuss this more in  
another post.
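
A small sketch of the point, assuming a toy line-based "s p o ." syntax (not real N-Triples or RDF/XML) and a hypothetical parser name: the document contains two statements, but because an RDF graph is a set of triples, only one survives in the graph.

```python
# Sketch only: a repeated statement in a document collapses in the graph,
# because the graph is a *set* of triples. The toy syntax and the function
# name are assumptions made for illustration.

def parse_toy_triples(text):
    """Parse whitespace-separated 's p o .' lines into a set of triples."""
    triples = set()
    for line in text.splitlines():
        line = line.strip().rstrip(".").strip()
        if line:
            s, p, o = line.split()
            triples.add((s, p, o))
    return triples

doc = """
_:x p y.
_:x p y.
"""

# Two statements at the level of syntax...
statements = [line for line in doc.splitlines() if line.strip()]
# ...but a single triple at the level of the graph.
graph = parse_toy_triples(doc)
```

An implementation that stores a list of statements rather than a set could still report the repetition, but that count is a fact about the storage, not about anything the RDF semantics can distinguish.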

Cheers,
Bijan

Received on Saturday, 19 August 2006 17:13:11 UTC