Re: Review RDF 1.1 Semantics (ED 3rd June 2013) from Peter Patel-Schneider on 2013-06-12 (public-rdf-wg@w3.org from June 2013)

From: Peter Patel-Schneider <pfpschneider@gmail.com>
Date: Wed, 12 Jun 2013 09:04:08 -0700
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Cc: Pat Hayes <phayes@ihmc.us>, RDF WG <public-rdf-wg@w3.org>
Message-ID: <CAMpDgVx5Nc53XGiO2FkrS2MSkMJ+paWTHyWUyREBwT4GpmovRw@mail.gmail.com>
This review by Antoine will require WG discussion.

peter


On Wed, Jun 12, 2013 at 3:53 AM, Antoine Zimmermann
<antoine.zimmermann@emse.fr> wrote:
> Pat, Peter,
>
>
> This is my review of RDF 1.1 Semantics. Sorry for sending this so late.
> On the plus side, I'd say that overall, the presentation has been much
> improved. Making interpretations independent from a vocabulary is a big
> bonus, and making D-interpretations independent from the RDF vocabulary is
> also much better. Putting the rules in context with the corresponding
> entailment regime is also good.
>
> Now, for the main criticism, I have two outstanding problems with the
> current version:
>  1. D-entailment using a set rather than a mapping;
>  2. Define entailment of a set as entailment of the union.
>
>
> 1. D-entailment
> ===============
>
> Concerning 1, the implication of the new definition is that, given a D, it
> is not generally possible to know what the valid D-entailments are.
>
> For instance, consider D = {http://example.com/dt}. What does the triple:
>
>  <s> <p> "abc"^^<http://example.com/dt> .
>
> D-entails? The specification does not say.
> Moreover, because of the absence of a known mapping from IRIs to datatypes,
> there are a few ill-defined conditions:
> For instance, in Section 9, the table "Semantic conditions for datatyped
> literals" says:
>
> """
> For every other IRI aaa in D, and every literal "sss"^^aaa,
> IL("sss"^^aaa)=L2V(I(aaa))(sss)
> """
>
> L2V is only defined for datatypes, whereas I(aaa) is not constrained to be a
> datatype. Even if it were constrained to be a datatype, this would not
> define the value of IL("sss"^^aaa), unless aaa is one of the normative XSD
> datatype IRIs.
>
> In any case, no matter how you tweak the definitions, the application MUST
> have a mapping from the set of "recognised" datatype IRIs to some specific
> datatypes.
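
The point about needing a mapping can be illustrated with a minimal Python sketch. It is not taken from the specification: the names `DATATYPE_MAP`, `RECOGNIZED_IRIS` and `literal_value` are hypothetical, and a datatype is reduced here to its lexical-to-value function (L2V). With only a set of recognised IRIs there is nothing to compute a value from, while a map gives each IRI a definite datatype.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"


def l2v_integer(lexical):
    """Simplified L2V for xsd:integer; None stands for an ill-typed literal."""
    return int(lexical) if re.fullmatch(r"[+-]?[0-9]+", lexical) else None


def l2v_string(lexical):
    """L2V for xsd:string: every lexical form denotes itself."""
    return lexical


# D as a map: each recognised IRI is tied to one specific datatype.
DATATYPE_MAP = {XSD + "integer": l2v_integer, XSD + "string": l2v_string}

# D as a bare set: the IRI is "recognised", but no datatype is attached to it.
RECOGNIZED_IRIS = {XSD + "integer", "http://example.com/dt"}


def literal_value(lexical, datatype_iri, datatype_map):
    """Compute L2V(I(aaa))(sss); fails when no datatype is known for the IRI."""
    if datatype_iri not in datatype_map:
        raise KeyError(f"no datatype known for <{datatype_iri}>")
    return datatype_map[datatype_iri](lexical)
```

Under these assumptions, `literal_value("042", XSD + "integer", DATATYPE_MAP)` has a definite value, whereas `"abc"^^<http://example.com/dt>` cannot be evaluated even though its IRI is in `RECOGNIZED_IRIS`.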
>
> Later in Section 10, it says that "if D < E and S E-entails G then S
> D-entails G." Since no constraints are given on how to interpret
> "recognised" non-XSD datatype IRIs, it is possible that the same IRI in
> D-entailment is interpreted differently in E-entailment.
>
> In Section 11, the table "RDF semantic conditions" says:
>
> """
> For every IRI aaa in D, <x,I(aaa)> is in IEXT(I(rdf:type)) if and only if x
> is in the value space of I(aaa)
> """
>
> This is ill-defined because the value space of I(aaa) may not exist. Again,
> even if I(aaa) is constrained to be a datatype, how do we know what its
> value space is? Therefore, the condition cannot be verified in general.
>
> Finally, the reasons why this change has been made are unclear. The working
> group was not chartered to do anything about that, the workshop in 2010 did
> not point to any problems with datatype maps, and this Working Group did
> not discuss or complain about D being a map when the change was made.
> No prior discussions were attempted before making the change.
>
> Implementations that rely on custom datatypes are interpreting the custom
> datatype IRIs according to one specific, known datatype, therefore, they do
> have a datatype *map* implemented. There is zero motivation to make such a
> change.
>
>
>
> 2. using union
> ==============
> This issue is different from the previous one because it does not make the
> definitions and propositions incorrect.
>
> I see two problems with the new definition: first, it makes the notion of
> entailment in RDF different from the standard, universally accepted notion
> of entailment in logic. In general, no matter what semantics is considered,
> entailment is defined as follows:
>
> """
> A set S of formulas in the language entails a formula F in the same language
> if and only if all interpretations that satisfy all the formulas of S also
> satisfy F.
> """
>
> That's what was in RDF 2004, that's what's in OWL, that's what's in any
> logic with a model-theory.
>
> There are also inconvenient consequences for manipulation of RDF graphs:
> how is it supposed to be implemented? Assume we have two representations of
> two graphs. How do you know what's the union of the two graphs? You do not
> have access to the bnodes, only to identifiers or locations in files or in
> memory. There is a rule of thumb saying "different documents, different
> bnodes". And what about RDF graphs in an in-memory model? What about two
> examples of RDF graphs in Turtle in a written article? They are in the same
> document, they certainly share bnodes, right?
>
> Now if we take the simple case where the application is able to determine
> that the bnodes are disjoint, how can it perform a union? The answer is that
> it must *separate apart* the bnode identifiers. So, while in 2004 there was
> a coherence between the way merge was defined and the way it has to be
> implemented, now there is a discrepancy between the definition and the
> practice.
>
> Then, once the separation apart is made to produce a representation of the
> union, the created graph is, by definition of union, sharing bnodes with the
> two original graphs. But how can the overlap of bnodes be recognised in and
> out of the application? One would need meta-information about the
> relationship between the graphs. And how to represent and store that
> relationship?
>
> Also, suppose one wants to keep two graphs that share bnodes separate (say,
> they are distinct graphs in the same TriG file). Then these graphs cannot be
> stored separately if one wants to retain equivalent inferences on the set of
> graphs. That is, if I have {G1,G2} such that G1 and G2 share some bnodes,
> storing G1 apart would create a "copy" of G1 with disjoint bnodes. The new
> graph, H1, would be equivalent to G1, but the set {H1,G2} would not yield
> the same entailments as {G1,G2}.
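
The difference between union and merge described above can be sketched in Python. This is only an illustration under stated assumptions: graphs are modelled as frozensets of triples, the class `BNode` and the helpers `blank_nodes`, `copy_with_fresh_bnodes`, `union` and `merge` are hypothetical names, and a blank node is identified by object identity rather than by any label.

```python
class BNode:
    """A blank node: it has no name, only an identity (the Python object itself)."""


def blank_nodes(graph):
    """Return the set of blank nodes occurring in a graph (a frozenset of triples)."""
    return {term for triple in graph for term in triple if isinstance(term, BNode)}


def copy_with_fresh_bnodes(graph):
    """Return an equivalent copy of the graph whose blank nodes are all new."""
    renaming = {b: BNode() for b in blank_nodes(graph)}
    return frozenset(
        tuple(renaming.get(term, term) for term in triple) for triple in graph
    )


def union(g1, g2):
    """Plain set union: blank nodes shared by g1 and g2 stay shared."""
    return g1 | g2


def merge(g1, g2):
    """Merge in the 2004 sense: blank nodes are standardised apart first."""
    return g1 | copy_with_fresh_bnodes(g2)


# G1 and G2 share one blank node; H1 is G1 stored apart, with a fresh one.
shared = BNode()
g1 = frozenset({("ex:a", "ex:p", shared)})
g2 = frozenset({(shared, "ex:q", "ex:c")})
h1 = copy_with_fresh_bnodes(g1)
```

In this model `union(g1, g2)` contains a single blank node linking the two triples, while `union(h1, g2)` and `merge(g1, g2)` each contain two unrelated blank nodes, which is why {H1,G2} and {G1,G2} can differ in what they entail under the union-based definition.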
>
> Finally, the decision to replace merge with union was first put into the
> document without prior discussion with the Working Group, without evidence
> that it follows practice, and without evidence that it solves known issues.
> The notion of merge was not identified as a subject of concern during the
> W3C workshop in 2010. Implementations do implement RDF 2004 correctly.
>
>
>
> Conclusion:
> ===========
> More generally, any change like this is disruptive for education. If this design
> is standardised at the end of the year, there will be a gap between what's
> in the standard and what has been written for years in tutorials, courses,
> research papers, and so on.
>
> Considering that I see no added value compared to 2004 from both these
> changes, and having even identified flaws, I oppose publication of RDF 1.1
> Semantics in such a state.
>
> Note that the solution I propose is simple, and simpler than what is
> proposed: to go back to the old design concerning entailment of a set of
> graphs and datatype map. My proposal is also less likely to trigger
> unsupportive comments in the Last Call phase. We cannot afford to spend more
> time inventing a new design.
>
>
>
> Minor remarks:
> ==============
> I think there are too many sections. Simple interpretations and simple
> entailment can be subsections of a common section. The same for
> D-interpretations and D-entailment.  Same for RDF interpretations and RDF
> entailment; same for RDFS.
>
> Section 3:
> """For example, RDF statements of the form:
>
>  ex:a  rdfs:subClassOf  owl:Thing .
>
> are forbidden in the OWL-DL [OWL2-SYNTAX] semantic extension."""
>
>  -> This triple can be a valid part of an OWL 2 DL ontology. A better
> example would be:
>
>  ex:a  rdfs:subClassOf  "Thing" .
>
> Moreover, perhaps a reference to OWL 2 mapping to RDF graphs [1] would be
> better, since [OWL2-SYNTAX] defines OWL 2 ontologies in terms of a
> functional syntax that does not say anything about the constraints in the RDF
> serialisation.
>
> Section 4:
> "A typed literal contains two names" -> We do not have the notion of typed
> literals since all literals are typed.
> "Two graphs are isomorphic when each maps into the other by a 1:1 mapping on
> blank nodes." -> this is very much underspecified. There are other
> constraints on isomorphisms.
> "Graphs share blank nodes ... of distinct blank nodes." -> this discussion
> should not be here. In fact, it should rather appear in Concepts. In any
> case, it does not belong to notation and terminology.
> "This document will often treat a set of graphs as being identical to a
> single graph comprising their union, without further comments." -> if my
> concerns above are taken into account, this should be removed. A definition
> of merge should be added instead. By the way, I haven't seen many sets of
> graphs being treated as a single graph. Actually, I think I only saw it
> twice. So we cannot say "often".
>
> Section 5:
> Make it a subsection of "Simple semantics"? "Simple entailment"?
> "a function from expressions (names, triples and graphs) to semantic
> values:" -> what's a "semantic value"?
> "triple s p o then ..." -> why not "triple (s, p, o)" ?
> Same remark in item 4 of Section 5.2
>
> Section 6:
> Make it a subsection of "Simple semantics"? "Simple entailment"?
> "a graph G simply entails a graph E when every interpretation which
> satisfies G also satisfies E, and a set S of graphs simply entails E when
> the union of S simply entails E" -> change this to "a set S of graphs simply
> entails a graph E when every interpretation which satisfies all graphs in S
> also satisfies E"
> Remove the Change Note.
>
> Section 6.1:
> "the inference from (P and Q) to P, and the inference from foo(baz) to
> (exists (x) foo(x))." -> the notation "(P and Q)" etc is rather obscure in
> this context. Perhaps it would be good to present the usual First Order
> Logic translation of the semantics. BTW, the usual FOL translation would not
> be valid for entailments over a set of graphs because {FOL(G1),FOL(G2)} is
> equivalent to FOL(merge(G1,G2)).
> The example with ex:a ex:p _:x is confusing RDF graphs and RDF documents, as
> well as bnodes and bnode identifiers. Then, while the naive readers would
> intuitively imagine that taking the union of the two triples would simply
> amount to putting them together, they realise that they have to "standardise
> apart" the bnode identifiers.
>
> Section 7:
> "For any graph H, if sk(G) entails H then there is a graph H' such that G
> entails H' and H=sk(H')" -> this should rather be: "For any graph H, if
> sk(G) entails H then there is a skolemization sk'(H) of H such that G
> entails sk'(H)"
>
> Section 8:
> Remove the second change note, as per my concerns above.
> "datatype d refers to (denotes) the value" -> why not just say "denotes"
> "L2V(d)(string)" -> rather, L2V(d)(sss)
> "rdf:plainLiteral" -> "rdf:PlainLiteral"
> "the datatype it refers to MUST be specified unambiguously" -> yes, there
> MUST be a mapping from datatype IRIs to datatypes, i.e., there must be a
> datatype map. This is a MUST, why doesn't it appear as a constraint in the
> formal semantics?
>
> Section 9:
> Make it a subsection of "D-semantics"? "D-entailment"?
>
> Section 10:
> Make it a subsection of "D-semantics"? "D-entailment"?
> "a set S of graphs (simply) D-entails or entails recognizing D a graph G
> when every D-interpretation which makes S true also D-satisfies G." -> "a
> set S of graphs (simply) D-entails a graph G when every D-interpretation
> which satisfies all graphs in S also D-satisfies G."
>
> Section 10.1:
> why not put the general rule for datatype entailment:
> """
> aaa uuu "xxx"^^ddd => aaa uuu "yyy"^^eee
> where L2V(I(ddd))(xxx) = L2V(I(eee))(yyy)
> """
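
As a rough illustration of the general rule quoted above, the following Python sketch checks whether one triple follows from another by replacing a literal with a different literal of the same value. It is a simplification under stated assumptions: only two datatypes are covered, their lexical spaces are approximated with regular expressions, and the names `L2V`, `same_value` and `rule_applies` are hypothetical.

```python
import re
from decimal import Decimal


def l2v_integer(lexical):
    """Simplified L2V for xsd:integer; None stands for an ill-typed literal."""
    return int(lexical) if re.fullmatch(r"[+-]?[0-9]+", lexical) else None


def l2v_decimal(lexical):
    """Simplified L2V for xsd:decimal; None stands for an ill-typed literal."""
    pattern = r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"
    return Decimal(lexical) if re.fullmatch(pattern, lexical) else None


# Assumed mapping from recognised datatype names to their L2V functions.
L2V = {"xsd:integer": l2v_integer, "xsd:decimal": l2v_decimal}


def same_value(literal1, literal2):
    """True when both (lexical form, datatype) pairs are well-typed and equal in value."""
    (lex1, dt1), (lex2, dt2) = literal1, literal2
    v1, v2 = L2V[dt1](lex1), L2V[dt2](lex2)
    return v1 is not None and v2 is not None and v1 == v2


def rule_applies(premise, conclusion):
    """aaa uuu "xxx"^^ddd => aaa uuu "yyy"^^eee when the two literals share a value."""
    (s1, p1, o1), (s2, p2, o2) = premise, conclusion
    return s1 == s2 and p1 == p2 and same_value(o1, o2)
```

For instance, under these assumptions the triple with object `("1", "xsd:integer")` licenses the same triple with object `("1.0", "xsd:decimal")`, but not with `("2", "xsd:integer")`.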
>
> Section 11:
> Make it a subsection of "RDF semantics"? "RDF entailment"?
>
>
> Section 12:
> Make it a subsection of "RDF semantics"? "RDF entailment"?
>
> Section 12.1:
> Group the rules together, as in Section 14.1
>
> Section 13:
> Make it a subsection of "RDFS semantics"? "RDFS entailment"?
>
> Section 14:
> Make it a subsection of "RDFS semantics"? "RDFS entailment"?
>
> Section 15:
> "plus an optional default graph" -> the default graph is not optional, there
> must be exactly one
>
> Appendix A:
> "follows exactly the terms used in [OWL2-SYNTAX]" -> it is [OWL2-PROFILES],
> in Section 4.3. OWL2-SYNTAX does not rely on RDF triples
> "Every RDF(S) closure, even starting with the empty graph, will contain all
> RDF(S) tautologies" -> not all, the closure as defined is finite, while
> there are infinitely many tautologies. All tautologies concerning the
> vocabulary of the initial graph, union the tautologies in the RDF and RDFS
> vocabularies.
>
> Appendix C:
> The proof that every graph is satisfiable does not need to introduce Herbrand
> interpretations and does not need to build an interpretation for each graph
> considered. There is a single interpretation that makes all RDF graphs simply
> true. Consider a domain comprising only one element x. Map all IRIs and
> literals to x, including those used as predicates. Make the IEXT of x be the
> single pair {(x,x)}. This simply satisfies all graphs.
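
The one-element interpretation described above can be written down as a small Python sketch. It is only an illustration: the names `X`, `denotation`, `IEXT` and `simply_satisfies` are hypothetical, terms are plain strings, and blank nodes are also sent to x because that is the only assignment available in a one-element domain.

```python
X = "x"  # the single element of the domain


def denotation(term):
    """Map every IRI, literal and blank node to the only available element."""
    return X


# The extension of x, used as a property, is the single pair (x, x).
IEXT = {X: {(X, X)}}


def simply_satisfies(graph):
    """Check that every triple (s, p, o) has (I(s), I(o)) in IEXT(I(p))."""
    return all(
        (denotation(s), denotation(o)) in IEXT[denotation(p)]
        for (s, p, o) in graph
    )
```

Since every triple reduces to checking that (x, x) is in IEXT(x), `simply_satisfies` returns True for any set of triples, including the empty graph.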
>
> Appendix D.1:
> "The subject of a reification,/a>" -> typo
>
> Appendix D.2:
> The RDF container vocabulary should also mention rdfs:member,
> rdfs:ContainerMembershipProperty.
> --
> Antoine Zimmermann
> ISCOD / LSTI - Institut Henri Fayol
> École Nationale Supérieure des Mines de Saint-Étienne
> 158 cours Fauriel
> 42023 Saint-Étienne Cedex 2
> France
> Tél:+33(0)4 77 42 66 03
> Fax:+33(0)4 77 42 66 66
> http://zimmer.aprilfoolsreview.com/
Received on Wednesday, 12 June 2013 16:04:39 UTC