Re: Review of the Ent. Reg. Document by Markus from Markus Krötzsch on 2011-11-25 (public-rdf-dawg@w3.org from October to December 2011)

From: Markus Krötzsch <markus.kroetzsch@cs.ox.ac.uk>
Date: Fri, 25 Nov 2011 08:52:06 +0000
To: birte.glimm@uni-ulm.de
CC: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <4ECF5736.4080205@cs.ox.ac.uk>

Dear Birte, dear WG,

thank you for the revisions. I think that all of my comments have been 
addressed appropriately.

There was one open question about my minor comment on an Editorial Note. 
I had misread the note to refer to the earlier example in the main text, 
not to the one in the note. It is clear now.

Regards,

Markus

-- 
Dr. Markus Krötzsch
Department of Computer Science, University of Oxford
Room 306, Parks Road, OX1 3QD Oxford, United Kingdom
+44 (0)1865 283529               http://korrekt.org/


On 23/11/11 18:18, Birte Glimm wrote:
> Hi all,
> I commented inline regarding the performed changes.
> Birte
>
> On 21 November 2011 22:14, Birte Glimm <birte.glimm@uni-ulm.de> wrote:
>> Hi all,
>>
>> I attach the review from Markus below. I'll try and address most
>> comments tomorrow.
>>
>> Birte
>>
>> The Entailment Regimes draft is in very good shape. The explanations
>> are helpful and detailed, layout and notation are used consistently,
>> the structure is clear and systematic. Overall, I think this is a very
>> clear and helpful document, also due to the many informative
>> explanations. I have found only a few technical issues that may
>> require some further clarification or minor correction. There are also
>> a small number of (very) minor editorial issues that I am listing at
>> the end.
>>
>> *** Inconsistency handling in RDFS
>>
>> I was puzzled by Section 4.1.1 where it is suggested that tools could
>> avoid inconsistency checks (or do them lazily). The example is the one
>> with the 10002 triples and the join in the query. Even if a tool would
>> not report an inconsistency error, the inconsistency would still
>> affect the semantics since all statements would now be entailed
>> (restricted to the finitely many statements allowed by the
>> conditions). So the join optimisation in the example seems to lead to
>> incomplete results. I suppose incomplete answers are acceptable for
>> conformant SPARQL implementations, but incomplete BGP matching could
>> also lead to unsound query results for the overall query. So it does
>> not seem to be advisable for a SPARQL implementation of the RDFS
>> entailment regime to not check for inconsistencies, even if it is not
>> required to report an error (which still has some advantages, as the
>> later example with the HTTP-based query services illustrates).
>
> This issue has been discussed intensively in the WG and it was decided
> to not require a consistency check. In the case of RDF(S) the
> inconsistency can only arise from malformed XML literals, which are
> usually treated as "repaired" by tools that don't check consistency.
> The WG decided that it is up to the implementors whether an up-front
> consistency check is done or not.
>
>> *** Primitive datatypes
>>
>> Section 5 explains what a datatype consists of and how literal values
>> are assigned to a canonical form. However, the specification of the
>> entailment regime (and all later regimes) then speaks of "primitive
>> datatypes from which [some other datatype] is derived". The general
>> datatype mechanism of RDF does not have a notion of "derived"
>> datatypes and it is not clear what this means here. The goal of this
>> formulation was to avoid the same value being reported as, e.g.,
>> "1"^^xsd:byte and "1"^^xsd:int. Since RDF graph equivalence does not
>> take datatypes into account, condition 4 of entailment regimes forces
>> a form of normalisation. The difficulty is that canonical forms are
>> defined for a value when considered as a value of a particular datatype,
>> but there is no mechanism to obtain a canonical datatype for a value
>> (which may belong to many, possibly incomparable datatypes). I suggest
>> to solve this by introducing the idea of such a canonical datatype
>> upfront, at the place where the canonical values are now discussed. It
>> can then be explained that the canonical types in XML Schema are the
>> primitive types.
>>
>> With canonical datatypes for values and canonical lexical forms for
>> values+datatypes, one can then define a canonical literal for each
>> value.
>
> Sec. 5 has now been extended to define canonical datatypes and
> literals before specifying the regime. The new definitions are now
> used in the specification of the D-entailment regime.
>
>> *** OWL RL URI
>>
>> Section 6.4 specifies a URI to be used in the service description by
>> tools that support OWL RL. But Section 7.5.3 states that this URI
>> "can" be used by "Endpoints that use the OWL 2 Direct Semantics
>> entailment regime and that support the OWL 2 RL profile". It should be
>> clarified if the use of this URI says anything about the semantics or
>> not. If yes, then every profile URI would be needed in both semantics.
>> If no, then the sentence in 7.5.3 should be changed to avoid this
>> impression (and maybe another remark could be added how these URIs
>> relate to the entailment regime). In general, I wondered what exactly
>> these URIs mean, especially whether they indicate the maximal supported
>> fragment ("we support nothing that is not RL") or the minimal
>> supported one ("we support at least all of RL").
>
> 7.5.3 means 7.5.4 I suppose (7.5.3 is about OWL QL). The profile IRIs
> do not prescribe any semantics. The IRI for the semantics plus the
> profile IRI indicates what kind of entailment checker is used to
> answer the queries. This has been clarified in the ent. reg. spec now.
> In particular, the profile descriptions have been moved to the
> RDF-Based Semantics as this regime comes before the Direct Semantics
> regime and profile restrictions can also apply with the RDF-Based
> Semantics.
>
>> *** Declarations for Direct Semantics
>>
>> From Sections 7.1 and 7.2, I did not fully understand where
>> declarations can be given in a query. After reading the Appendix, I
>> believe that they can be in the ontology that the active graph
>> represents and in individual BGPs but not in imported ontologies and
>> not in outer graph patterns.
>
> Declarations can also be in imported ontologies of the active/scoping
> graph. Queries are answered w.r.t. O(SG) where SG is the scoping graph
> and in order to build O(SG) the OWL parsing process is used. Since the
> parsing process will load all imported ontologies incl. the
> declarations that are in the imported ontologies, all knowledge about
> types that is in the queried ontology (which includes the imports) can
> be used (see also 7.1.1 and beginning of 7.1, typing is also discussed
> explicitly in 7.1.3).
>
> Thus, mainly variables have to be typed unless also terms are used in
> the query that do not occur in the queried ontology, which usually
> does not make much sense.
>
>> In particular, a query with a UNION needs
>> to have declarations in each part, they cannot be given at a higher
>> level of the query.
>
> Yes.
>
>> Also, variables can have different declarations in
>> different BGPs. This might be worth a remark.
>
> Added a remark at the end of 7.1.3.
>
>> *** Variables in Literal Positions
>>
>> The remarks of Section 7.3.2 seem to apply to OWL RDF-Based Semantics
>> just as well. This seems to be worth a remark (maybe the section
>> should even be moved to the RDF-Based Semantics section since this is
>> earlier in the document).
>
> The problem is that in all examples that seem to work for the direct
> semantics anonymous classes are used. For example, for the given
> query, the RDF-Based Semantics would not derive the infinitely many
> consequences. The same problem exists, however, for the RDF-Based
> Semantics, but queries that work there seem to be outside of OWL 2 DL.
> Thus, different examples are used (see 6.2 for the RDF-Based
> Semantics), but I added remarks, that in both cases we have the same
> problem that is addressed by (C2).
>
>>
>> *** Finite Answers in RIF
>>
>> It was not completely clear to me at first where the finiteness of
>> results is coming from in Section 8.1. The proof in Appendix C claims
>> that all regimes require that bindings are only taken from the
>> vocabulary defined for this regime, but this is not really done (or
>> needed) for RIF. There should probably be a corresponding remark in
>> Appendix C.
>
> I added a remark for the RIF regime about the safety conditions and
> linked to 8.3 where this is discussed.
>
>>
>> Minor editorial issues:
>>
>> * Introduction: ", or what kinds of errors can arise" ->  ", and what
>> kinds of errors can arise"
> done
>> * "aded" ->  "added"
> done
>> * Sentences should not start with abbreviations, the spelled out forms
>> should be used instead (cases that occur in the document are "E.g."
>> and "I.e.").
> done
>> * The use of "a" to abbreviate "rdf:type" in the document is probably
>> not helpful for a reader. The definitions and normative discussions
>> require frequent use of "rdf:type" and there are many other rdf(s) and
>> owl terms that have no such abbreviation. Writing one of them as "a"
>> in some cases (but not in all) only introduces a source of confusion.
>> A short remark about "a" being syntactic sugar might be useful, but I
>> would eliminate it from the examples.
>
> I personally think that if people are confused about that, they will
> be confused about too many other things too. A while ago rdf:type was
> consistently replaced by a at another reviewer's request, but
> apparently some rdf:types slipped back in. Now they are all back to
> rdf:type. Whoever reviews last has the say. I agree that consistency
> is good and hope no a's slipped through (not exactly easy to search
> for).
>
>> * "imaginary IRI"; probably not the right word; maybe "exemplary"
>> would be better; or maybe just fix the meaning to this concrete IRI
>> right away
> I use exemplary now. Other specs seem to just pick a prefix without
> saying anything about it, e.g., OWL uses a:foo without ever saying
> anything about the prefix a. SPARQL Query just uses ex without
> introduction.
>
>> * Use hyphens for prefixes consistently: "re-naming" vs. "recaptured"
>> (there might be other uses of "re-"; I guess "sub-" is another
>> candidate to check)
> done
>> * Figure 1: the colour coding (RDF special terms vs. RDFS special
>> terms) does not work in monochrome printout
> colors changed to be b/w compatible
>> * "Semantic Web" is not capitalised consistently
> done
>> * "Similarly, for OWL 2 DL entailment" should say "Direct Semantics"
>> rather than "DL"
> done
>> * "Further explanation are"
> done
>> * "The term rdfV refers to ..." There and in all similar places, the
>> word "term" is used in adjacent sentences to refer to RDF terms and to
>> the denomination of a meta-level concept (a set of terms). A possible
>> source of confusion.
> rephrased (The set rdfV contains URI references of the ...)
>> * "to a large extend" ->  "extent"
> done
>> * Figure 2, caption should say "RDF graphs" (plural)
> done
>> * Notation of variables. As far as I know, SPARQL uses the syntactic
>> forms "?x" and "$x" to denote the variable "x", i.e., the variable is
>> "x" and not "?x". This is correctly implemented in most result tables
>> but not in Section 8.4.2.2 and (all of) Section 9, where "?" appears
>> in table headers. Moreover, Section 3.2 applies some solution mappings
>> mu to "?x" rather than to "x" (but it is also correct in some places
>> in that section). I did not notice it elsewhere but it might be worth
>> checking.
> done
>> * The Editorial Note in Section 3.2 states that "ex:a ex:b ?x would
>> have no solutions at all". This does not seem to be the right example
>> since the related triple in the example graph does not use the rdf:_n
>> entities that the note talks about.
> Not sure I understand that. The note says that given
> Data: ex:a ex:b rdf:_1 .
> BGP: ?x rdf:type rdf:Property
> currently yields 1 answer with binding rdf:_1 (but not rdf:_2, ...)
> If rdf:_n is generally forbidden for bindings, the same BGP has no
> answer and even
> BGP: ex:a ex:b ?x .
> over the data has no answer.
> Shall I somehow rephrase the note?
>
>> * "Some triples that are well-formed for OWL 2 DL, are" ->  remove ","
> done
>> * Section 7.1.3, example 1 offers an interpretation of the query in
>> terms of declarations. But since declarations cannot be queried in
>> this entailment regime in any case this might be a misleading example
>> (offering a possible interpretation for the query that no Direct
>> Semantic query can ever have, even if ambiguous).
> removed the declarations example, added instead object and data
> property interpretations for the triple as probably most common
> examples
>> * "Higher Order Queries" vs. "First-Order Semantics". Probably have a
>> hyphen in both cases.
> done
>> * Section 7.5, introduction, speaks about EL and QL only but the
>> section covers all three profiles.
> changed to also clarify the use of profile IRIs in Service Descriptions
>> * Section 8: "more on this in 7.4" and "see 7.4" should both use "8.4".
> done and also linked to the section now
>> * Last sentence before Section 8.2: uses hyphens as dashes around
> done
>> "i.e. ..." Probably use commas or &ndash;
> used commas (more consistent with the rest of the doc)
>> * Same sentence, the "(1) - (3)" has spaces (other similar constructs
>> do not have this); could also be &ndash; but this is really minor.
> The XML XSLT processor somehow does not accept &ndash; as it is picky
> with a lot of things, so only spaces removed
>> * most uses of "cf." should probably be "see" or "see also" (when
>> strictly adhering to common style guides)
> done
>
>
>
>>
>>
>> --
>> Jun. Prof. Dr. Birte Glimm            Tel.:    +49 731 50 24125
>> Inst. of Artificial Intelligence         Secr:  +49 731 50 24258
>> University of Ulm                         Fax:   +49 731 50 24188
>> D-89069 Ulm                               birte.glimm@uni-ulm.de
>> Germany
>>
>
>
>
Received on Friday, 25 November 2011 08:52:32 UTC