Re: Review of the Ent. Reg. Document by Markus from Birte Glimm on 2011-11-23 (public-rdf-dawg@w3.org from October to December 2011)

From: Birte Glimm <birte.glimm@uni-ulm.de>
Date: Wed, 23 Nov 2011 19:18:00 +0100
To: SPARQL Working Group <public-rdf-dawg@w3.org>
Cc: Markus Krötzsch <markus.kroetzsch@cs.ox.ac.uk>
Message-ID: <CABt65Oc-SDzrZw8s5UGR3KwMa_gym7NgHYY_0GVrBHF1xH5oqQ@mail.gmail.com>
Hi all,
I commented inline regarding the performed changes.
Birte

On 21 November 2011 22:14, Birte Glimm <birte.glimm@uni-ulm.de> wrote:
> Hi all,
>
> I attach the review from Markus below. I'll try and address most
> comments tomorrow.
>
> Birte
>
> The Entailment Regimes draft is in a very good shape. The explanations
> are helpful and detailed, layout and notation are used consistently,
> the structure is clear and systematic. Overall, I think this is a very
> clear and helpful document, also due to the many informative
> explanations. I have found only a few technical issues that may
> require some further clarification or minor correction. There are also
> a small number of (very) minor editorial issues that I am listing at
> the end.
>
> *** Inconsistency handling in RDFS
>
> I was puzzled by Section 4.1.1 where it is suggested that tools could
> avoid inconsistency checks (or do them lazily). The example is the one
> with the 10002 triples and the join in the query. Even if a tool would
> not report an inconsistency error, the inconsistency would still
> affect the semantics since all statements would now be entailed
> (restricted to the finitely many statements allowed by the
> conditions). So the join optimisation in the example seems to lead to
> incomplete results. I suppose incomplete answers are acceptable for
> conformant SPARQL implementations, but incomplete BGP matching could
> also lead to unsound query results for the overall query. So it does
> not seem to be advisable for a SPARQL implementation of the RDFS
> entailment regime to not check for inconsistencies, even if it is not
> required to report an error (which still has some advantages, as the
> later example with the HTTP-based query services illustrates).

This issue has been discussed intensively in the WG and it was decided
not to require a consistency check. In the case of RDF(S), an
inconsistency can only arise from malformed XML literals, which are
usually treated as "repaired" by tools that don't check consistency.
The WG decided that it is up to the implementors whether an up-front
consistency check is done or not.
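
A minimal sketch of what an optional up-front check could look for,
assuming triples are plain Python tuples and typed literals are
(lexical form, datatype IRI) pairs. The function names are hypothetical
and the well-formedness test is only an approximation (it wraps the
lexical form in a dummy element and tries to parse it). Finding such a
literal only identifies a candidate for an inconsistency; it does not
decide entailment.

```python
import xml.etree.ElementTree as ET

RDF_XML_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"


def is_well_formed_xml_content(lexical_form):
    """Approximate check: does the string parse as XML element content?"""
    try:
        # Wrap in a dummy root so that mixed content such as "a <b>c</b>" parses.
        ET.fromstring("<wrapper>" + lexical_form + "</wrapper>")
        return True
    except ET.ParseError:
        return False


def malformed_xml_literals(triples):
    """Yield triples whose object is an rdf:XMLLiteral that fails the check.

    Assumed representation: a typed literal is a (lexical_form, datatype_iri)
    tuple; every other term is a plain string.
    """
    for s, p, o in triples:
        if (
            isinstance(o, tuple)
            and o[1] == RDF_XML_LITERAL
            and not is_well_formed_xml_content(o[0])
        ):
            yield (s, p, o)


data = [
    ("ex:a", "ex:b", ("<b>ok</b>", RDF_XML_LITERAL)),
    ("ex:a", "ex:c", ("<b>broken", RDF_XML_LITERAL)),
    ("ex:a", "ex:d", "ex:e"),
]
suspicious = list(malformed_xml_literals(data))
```

An implementation that skips this step simply never runs such a scan,
which is the freedom the WG decision leaves to implementors.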

> *** Primitive datatypes
>
> Section 5 explains what a datatype consists of and how literal values
> are assigned to a canonical form. However, the specification of the
> entailment regime (and all later regimes) then speaks of "primitive
> datatypes from which [some other datatype] is derived". The general
> datatype mechanism of RDF does not have a notion of "derived"
> datatypes and it is not clear what this means here. The goal of this
> formulation was to avoid the same value being reported as, e.g.,
> "1"^^xsd:byte and "1"^^xsd:int. Since RDF graph equivalence does not
> take datatypes into account, condition 4 of entailment regimes forces
> a form of normalisation. The difficulty is that canonical forms are
> defined for a value when considered as a value of a particular datatype,
> but there is no mechanism to obtain a canonical datatype for a value
> (which may belong to many, possibly incomparable datatypes). I suggest
> to solve this by introducing the idea of such a canonical datatype
> upfront, at the place where the canonical values are now discussed. It
> can then be explained that the canonical types in XML Schema are the
> primitive types.
>
> With canonical datatypes for values and canonical lexical forms for
> values+datatypes, one can then define a canonical literal for each
> value.

Sec. 5 has now been extended to define canonical datatypes and
literals before specifying the regime. The new definitions are now
used in the specification of the D-entailment regime.
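
To illustrate the idea (not the normative definition), here is a small
sketch of how a canonical literal could be computed for the decimal
branch of XML Schema. The table PRIMITIVE_OF and both function names
are hypothetical, and the canonical lexical form used here (no
superfluous zeros, at least one fractional digit) is an assumption made
for the example; the definitions in the spec may differ in detail.

```python
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical, deliberately tiny table: datatype -> primitive datatype it is
# derived from. Only part of the xsd:decimal branch is covered.
PRIMITIVE_OF = {
    XSD + "decimal": XSD + "decimal",
    XSD + "integer": XSD + "decimal",
    XSD + "int": XSD + "decimal",
    XSD + "byte": XSD + "decimal",
}


def canonical_decimal_form(value):
    """Assumed canonical form: no superfluous zeros, one fractional digit minimum."""
    if value == 0:
        return "0.0"  # also normalises "-0" and "0.00"
    text = format(value, "f")
    if "." in text:
        text = text.rstrip("0")
        if text.endswith("."):
            text += "0"
    else:
        text += ".0"
    return text


def canonical_literal(lexical_form, datatype):
    """Return (canonical lexical form, canonical datatype) for a typed literal."""
    primitive = PRIMITIVE_OF[datatype]  # canonical datatype = primitive datatype
    value = Decimal(lexical_form.strip())
    return (canonical_decimal_form(value), primitive)


# "1"^^xsd:byte and "01"^^xsd:int denote the same value, so they end up
# with the same canonical literal and are reported only once.
same = canonical_literal("1", XSD + "byte") == canonical_literal("01", XSD + "int")
```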

> *** OWL RL URI
>
> Section 6.4 specifies a URI to be used in the service description by
> tools that support OWL RL. But Section 7.5.3 states that this URI
> "can" be used by "Endpoints that use the OWL 2 Direct Semantics
> entailment regime and that support the OWL 2 RL profile". It should be
> clarified if the use of this URI says anything about the semantics or
> not. If yes, then every profile URI would be needed in both semantics.
> If no, then the sentence in 7.5.3 should be changed to avoid this
> impression (and maybe another remark could be added how these URIs
> relate to the entailment regime). In general, I wondered what exactly
> these URIs mean, especially if it informs about the maximal supported
> fragment ("we support nothing that is not RL") or the minimal
> supported one ("we support at least all of RL").

I suppose 7.5.3 means 7.5.4 (7.5.3 is about OWL QL). The profile IRIs
do not prescribe any semantics. The IRI for the semantics plus the
profile IRI indicates what kind of entailment checker is used to
answer the queries. This has now been clarified in the ent. reg. spec.
In particular, the profile descriptions have been moved to the
RDF-Based Semantics section, as this regime comes before the Direct
Semantics regime and profile restrictions can also apply with the
RDF-Based Semantics.

> *** Declarations for Direct Semantics
>
> From Sections 7.1 and 7.2, I did not fully understand where
> declarations can be given in a query. After reading the Appendix, I
> believe that they can be in the ontology that the active graph
> represents and in individual BGPs but not in imported ontologies and
> not in outer graph patterns.

Declarations can also be in imported ontologies of the active/scoping
graph. Queries are answered w.r.t. O(SG), where SG is the scoping graph,
and in order to build O(SG) the OWL parsing process is used. Since the
parsing process will load all imported ontologies incl. the
declarations that are in the imported ontologies, all knowledge about
types that is in the queried ontology (which includes the imports) can
be used (see also 7.1.1 and the beginning of 7.1; typing is also
discussed explicitly in 7.1.3).

Thus, it is mainly variables that have to be typed, unless the query
also uses terms that do not occur in the queried ontology, which usually
does not make much sense.

> In particular, a query with a UNION needs
> to have declarations in each part, they cannot be given at a higher
> level of the query.

Yes.

> Also, variables can have different declarations in
> different BGPs. This might be worth a remark.

Added a remark at the end of 7.1.3.

> *** Variables in Literal Positions
>
> The remarks of Section 7.3.2 seem to apply to OWL RDF-Based Semantics
> just as well. This seems to be worth a remark (maybe the section
> should even be moved to the RDF-Based Semantics section since this is
> earlier in the document).

The problem is that all examples that seem to work for the Direct
Semantics use anonymous classes. For example, for the given
query, the RDF-Based Semantics would not derive the infinitely many
consequences. The same problem exists, however, for the RDF-Based
Semantics, but queries that work there seem to be outside of OWL 2 DL.
Thus, different examples are used (see 6.2 for the RDF-Based
Semantics), but I added remarks that in both cases we have the same
problem, which is addressed by (C2).

>
> *** Finite Answers in RIF
>
> It was not completely clear to me at first where the finiteness of
> results is coming from in Section 8.1. The proof in Appendix C claims
> that all regimes require that bindings are only taken from the
> vocabulary defined for this regime, but this is not really done (or
> needed) for RIF. There should probably be a corresponding remark in
> Appendix C.

I added a remark for the RIF regime about the safety conditions and
linked to 8.3 where this is discussed.

>
> Minor editorial issues:
>
> * Introduction: ", or what kinds of errors can arise" -> ", and what
> kinds of errors can arise"
done
> * "aded" -> "added"
done
> * Sentences should not start with abbreviations, the spelled out forms
> should be used instead (cases that occur in the document are "E.g."
> and "I.e.").
done
> * The use of "a" to abbreviate "rdf:type" in the document is probably
> not helpful for a reader. The definitions and normative discussions
> require frequent use of "rdf:type" and there are many other rdf(s) and
> owl terms that have no such abbreviation. Writing one of them as "a"
> in some cases (but not in all) only introduces a source of confusion.
> A short remark about "a" being syntactic sugar might be useful, but I
> would eliminate it from the examples.

I personally think that if people are confused about that, they will
be confused about too many other things too. A while ago rdf:type was
consistently replaced by a at another reviewer's request, but
apparently some rdf:types slipped back in. Now they are all back to
rdf:type. Whoever reviews last has the say. I agree that consistency
is good and hope no a's slipped through (not exactly easy to search
for).

> * "imaginary IRI"; probably not the right word; maybe "exemplary"
> would be better; or maybe just fix the meaning to this concrete IRI
> right away
I use exemplary now. Other specs seem to just pick a prefix without
saying anything about it, e.g., OWL uses a:foo without ever saying
anything about the prefix a. SPARQL Query just uses ex without
introduction.

> * Use hyphens for prefixes consistently: "re-naming" vs. "recaptured"
> (there might be other uses of "re-"; I guess "sub-" is another
> candidate to check)
done
> * Figure 1: the colour coding (RDF special terms vs. RDFS special
> terms) does not work in monochrome printout
colors changed to be b/w compatible
> * "Semantic Web" is not capitalised consistently
done
> * "Similarly, for OWL 2 DL entailment" should say "Direct Semantics"
> rather than "DL"
done
> * "Further explanation are"
done
> * "The term rdfV refers to ..." There and in all similar places, the
> word "term" is used in adjacent sentences to refer to RDF terms and to
> the denomination of a meta-level concept (a set of terms). A possible
> source of confusion.
rephrased (The set rdfV contains URI references of the ...)
> * "to a large extend" -> "extent"
done
> * Figure 2, caption should say "RDF graphs" (plural)
done
> * Notation of variables. As far as I know, SPARQL uses the syntactic
> forms "?x" and "$x" to denote the variable "x", i.e., the variable is
> "x" and not "?x". This is correctly implemented in most result tables
> but not in Section 8.4.2.2 and (all of) Section 9, where "?" appears
> in table headers. Moreover, Section 3.2 applies some solution mappings
> mu to "?x" rather than to "x" (but it is also correct in some places
> in that section). I did not notice it elsewhere but it might be worth
> checking.
done
> * The Editorial Note in Section 3.2 states that "ex:a ex:b ?x would
> have no solutions at all". This does not seem to be the right example
> since the related triple in the example graph does not use the rdf:_n
> entities that the note talks about.
Not sure I understand that. The note says that given
Data: ex:a ex:b rdf:_1 .
BGP: ?x rdf:type rdf:Property
the BGP currently yields 1 answer with binding rdf:_1 (but not rdf:_2, ...).
If rdf:_n is generally forbidden for bindings, the same BGP has no
answer and even
BGP: ex:a ex:b ?x .
over the data has no answer.
Shall I somehow rephrase the note?

> * "Some triples that are well-formed for OWL 2 DL, are" -> remove ","
done
> * Section 7.1.3, example 1 offers an interpretation of the query in
> terms of declarations. But since declarations cannot be queried in
> this entailment regime in any case this might be a misleading example
> (offering a possible interpretation for the query that no Direct
> Semantic query can ever have, even if ambiguous).
removed the declarations example, added instead object and data
property interpretations for the triple as probably most common
examples
> * "Higher Order Queries" vs. "First-Order Semantics". Probably have a
> hyphen in both cases.
done
> * Section 7.5, introduction, speaks about EL and QL only but the
> section covers all three profiles.
changed to also clarify the use of profile IRIs in Service Descriptions
> * Section 8: "more on this in 7.4" and "see 7.4" should both use "8.4".
done and also linked to the section now
> * Last sentence before Section 8.2: uses hyphens as dashes around
> "i.e. ..." Probably use commas or &ndash;
done, used commas (more consistent with the rest of the doc)
> * Same sentence, the "(1) - (3)" has spaces (other similar constructs
> do not have this); could also be &ndash; but this is really minor.
The XSLT processor somehow does not accept &ndash; (it is picky about
a lot of things), so I only removed the spaces
> * most uses of "cf." should probably be "see" or "see also" (when
> strictly adhering to common style guides)
done






-- 
Jun. Prof. Dr. Birte Glimm            Tel.:    +49 731 50 24125
Inst. of Artificial Intelligence         Secr:  +49 731 50 24258
University of Ulm                         Fax:   +49 731 50 24188
D-89069 Ulm                               birte.glimm@uni-ulm.de
Germany
Received on Wednesday, 23 November 2011 18:18:40 UTC