RE: [TF-ENT] RDFS entailment regime proposal from Seaborne, Andy on 2009-09-28 (public-rdf-dawg@w3.org from July to September 2009)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Mon, 28 Sep 2009 15:00:15 +0000
To: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
CC: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <B6CF1054FDC8B845BF93A6645D19BEA3693EA8D548@GVW1118EXC.americas.hpqcorp.net>


> -----Original Message-----
> From: b.glimm@googlemail.com [mailto:b.glimm@googlemail.com] On Behalf Of
> Birte Glimm
> Sent: 28 September 2009 15:01
> To: Seaborne, Andy
> Cc: SPARQL Working Group
> Subject: Re: [TF-ENT] RDFS entailment regime proposal
> 
> 2009/9/28 Seaborne, Andy <andy.seaborne@hp.com>:
> >> -----Original Message-----
> >> From: public-rdf-dawg-request@w3.org [mailto:public-rdf-dawg-
> >> request@w3.org] On Behalf Of Birte Glimm
> >> Sent: 24 September 2009 18:31
> >> To: SPARQL Working Group
> >> Subject: [TF-ENT] RDFS entailment regime proposal
> >>
> >> Hi all,
> >> whoever is interested in RDFS entailment: I would be very happy about
> >> comments and suggestions for the RDFS entailment regime as outlined
> >> in:
> >> http://wiki.webont.org/page/SPARQL/OWL

> >
> > The "Illegal Handling" says an error must be raised for illegal graph or
> query.
> >
> > I would very much like to leave this undefined, which includes the
> possibility of raising an error but leaves the mechanism up to the
> implementation.
> >
> > That is, illegal data or query puts a system outside the spec.  For
> example, if the graph is illegal (e.g. bad literal lexical form) but the
> query never touches that part of the graph, then the processor should be
> free to return something, and not be forced to raise an error which might
> require touching the whole graph to check it.
> >
> > This also arises in query optimization as a BGP might be solved repeatedly
> by substitution (index join style) from data elsewhere in the query and
> the error might only be raised quite late on in query processing but
> the whole query is required to be an error by the spec.
> >
> >        Andy
> 
> Well, but under RDFS semantics you have to check consistency first
> anyway since an inconsistent graph entails all tuples. Bad lexical
> forms are not causing an inconsistency, only when combined with an
> assertion that the range of the used property/predicate is
> rdfs:Literal or rdf:XMLLiteral. Thus, if you parse a data set and find
> a literal that has a bad lexical form, you better check consistency
> anyway and after that you know whether your data is legal or not.
> Also, if a user asks
> SLEECT ?x WHERE { ?x <ex:b> <ex:c> . }
> I would expect an error because I wrote SLEECT instead of SELECT and I
> should be told that the query is not a legal query. Similarly
> SELECT ?x WHERE { ?x <ex:b> <ex:c> <ex:forthInATriple> . }
> should give me an error, right?

Yes, it's a syntax error, but I don't see how it's connected.  It can be detected statically from the query string alone.

Strictly, it's not a SPARQL query string, and what a service does with it is outside the spec, because the spec only defines what happens with query strings that match the grammar and says nothing about non-matching strings.  The SPARQL protocol error exists because the protocol requires that the request be a SPARQL query string.

But in the RDFS entailment case it's the data at issue.  For scalability, I'd like a processor that can process the query and get the answers to be able to return them.  As proposed, it's an error: it's no longer outside the spec; it's covered by the spec and explicitly wrong.  But if a processor can perform BGP matching without needing to touch the whole graph, then I think that should be allowed.  Similarly, if it can start generating answers and then finds a problem, a required error (and no results) means the processor can't stream and has to buffer all results before it sends any, which is a potentially huge cost.
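
To make the streaming point concrete, here is a minimal sketch in Python.  Everything in it is hypothetical and only for illustration: the names (IllegalGraphError, is_legal, match, answer_streaming, answer_strict), the "BAD:" marker standing in for a bad lexical form, and the single-predicate pattern standing in for a BGP.  It is not taken from any spec text or implementation.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Triple = Tuple[str, str, str]


class IllegalGraphError(Exception):
    """Hypothetical error raised when illegal data is encountered."""


def is_legal(triple: Triple) -> bool:
    # Assumption: an object starting with "BAD:" stands in for a
    # literal with a bad lexical form.
    return not triple[2].startswith("BAD:")


def match(graph: Iterable[Triple], predicate: str) -> Iterator[Triple]:
    """Lazily yield triples with the given predicate.

    An illegal triple raises only when the scan actually reaches it.
    """
    for triple in graph:
        if not is_legal(triple):
            raise IllegalGraphError(triple)
        if triple[1] == predicate:
            yield triple


def answer_streaming(graph: Iterable[Triple], predicate: str,
                     send: Callable[[Triple], None]) -> None:
    """Send each answer as soon as it is found.

    If an error is raised later, earlier answers have already gone out.
    """
    for triple in match(graph, predicate):
        send(triple)


def answer_strict(graph: Iterable[Triple], predicate: str,
                  send: Callable[[Triple], None]) -> None:
    """'Error and no results': buffer every answer before sending any."""
    buffered: List[Triple] = list(match(graph, predicate))  # may raise here
    for triple in buffered:
        send(triple)
```

With a graph whose second triple is illegal, answer_streaming has already sent the first answer by the time the error is raised, whereas answer_strict sends nothing but must hold the complete result set in memory before the first answer leaves.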

The entailment doc does not specify what an error is - what did you have in mind?  If it's going to be relatively undefined, then we can just say that if the data is illegal, all bets are off, i.e. any answers you get are not matching under RDFS entailment.

I'm assuming "error" means something like the errors we have in FILTER evaluation, i.e. no answers at best, or the notion of "error" in other systems, where it means returning an error code but no answers.  A situation where both an error code and answers are returned is harder to design over HTTP and may have problems with streaming (the return code is sent before the body).
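
The ordering problem can be sketched as follows.  This is only an illustration in Python with hypothetical names (stream_response, rows_with_late_failure); it models a response as an ordered sequence of chunks and does not use any real HTTP library.

```python
from typing import Iterable, Iterator


def stream_response(rows: Iterable[str]) -> Iterator[str]:
    """Yield a response as ordered chunks: status line first, then rows."""
    # The status is committed before any row has been computed.
    yield "HTTP/1.1 200 OK"
    for row in rows:
        yield row


def rows_with_late_failure() -> Iterator[str]:
    """Hypothetical result stream that hits illegal data after two rows."""
    yield "row-1"
    yield "row-2"
    raise ValueError("illegal data found late in evaluation")


def collect_until_failure(chunks: Iterator[str]) -> list:
    """Record what a client would have received before the failure."""
    received = []
    try:
        for chunk in chunks:
            received.append(chunk)
    except ValueError:
        pass  # too late to change the status line already sent
    return received
```

In this sketch, collect_until_failure(stream_response(rows_with_late_failure())) ends up holding the success status and two rows; the failure has no place to go except a truncated body, unless the whole result is buffered first.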

 Andy

> 
> I can see your point for simple entailment, but for RDFS entailment I
> would think that illegal data or query are best treated by an error.
> 
> Birte
> 
> 
> --
> Dr. Birte Glimm, Room 306
> Computing Laboratory
> Parks Road
> Oxford
> OX1 3QD
> United Kingdom
> +44 (0)1865 283529
Received on Monday, 28 September 2009 15:01:24 UTC