Factoring of entailment regimes (was: Re: Ill-typed vs. inconsistent?) from Richard Cyganiak on 2012-11-15 (public-rdf-wg@w3.org from November 2012)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Thu, 15 Nov 2012 19:28:11 +0000
To: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>, Antoine Zimmermann <antoine.zimmermann@emse.fr>
Cc: RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <B252CF07-8BED-43BF-A17D-85DAE7121533@cyganiak.de>
Pierre-Antoine, others,

You bring up the question of which “language features” should appear in which entailment regime.

The current factoring is (if I'm not mistaken):

1. Simple entailment
   - URIs
   - blank nodes
   - plain literals

2. RDF entailment
   - rdf:Property
   - rdf:XMLLiteral
   - Other RDF vocabulary (covered by imposing no extra conditions)

3. RDFS entailment
   - Classes, domain, range, subclass, subproperty

4. D-entailment
   - Datatype map
   - Typed literals
   - Datatypes-as-classes

This will likely have to change a bit, right? Plain literals are gone, rdf:XMLLiteral is no longer special, and rdf:langString has been introduced. Also, it has been suggested that typed literals should be introduced “earlier”, without requiring all of RDFS and friends first. This has sometimes been called “LV-entailment”. So this suggests to me:

1. Simple entailment
   - IRIs
   - blank nodes
   - rdf:langString

2. LV-entailment
   - Datatype map
   - Typed literals (automatically covers rdf:XMLLiteral)

3. RDF entailment
   - rdf:Property

4. RDFS entailment
   - Classes, domain, range, subclass, subproperty

5. D-entailment
   - Datatypes-as-classes (incl. datatype-as-range, type clashes)

Now, some things are ugly in this picture. First, it's strange that simple entailment would have language-tagged strings, but not plain strings (which only come in at LV). Second, I don't think that RDF entailment pulls its weight if it's just about rdf:Property. (It's also confusingly named, given that *everything* here is about RDF.) So I'd suggest collapsing things a bit:

1. Simple entailment
   - IRIs
   - blank nodes
   - Datatype map
   - Typed literals (automatically covers rdf:XMLLiteral)
   - rdf:langString

2. RDFS entailment
   - rdf:Property
   - Classes, domain, range, subclass, subproperty

3. D-entailment
   - Datatypes-as-classes (incl. datatype-as-range, type clashes)

In my eyes, this has some nice features. First, it explains *all* of the RDF data model right away in Simple Entailment. All the rest is about introducing additional vocabulary and its semantics, and not about giving the proper meaning to additional basic language constructs. Second, I think bringing all of the RDF+RDFS vocabulary in at the same level matches reality better. Third, it means that *all* entailment regimes are parameterized via a datatype map, which is conceptually nice and clean, IMO.

Thinking more about it, one might even consider merging 2 and 3. Simple entailment would then explain the RDF data model, and RDFS entailment would explain RDFS.
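
To make the "parameterized via a datatype map" point concrete, here is a rough Python sketch. It is only an illustration, not a proposal for the spec: the names (Literal, ill_typed_literals, satisfiable_wrt_literals) are made up for this mail, IRIs are modelled as plain strings, a datatype map is modelled as a dict from datatype IRI to a lexical-space check, and the only thing tested is the typed-literal condition discussed in the quoted thread below. With an empty datatype map nothing constrains the literal; with xsd:integer in the map, "foo"^^xsd:integer makes the graph unsatisfiable.

```python
import re
from typing import Callable, Dict, List, NamedTuple, Tuple, Union

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


class Literal(NamedTuple):
    """A typed literal: a lexical form plus a datatype IRI."""
    lexical_form: str
    datatype: str


# In this sketch, IRIs are plain strings; blank nodes are left out.
Term = Union[str, Literal]
Triple = Tuple[Term, Term, Term]

# Assumed model of a datatype map: datatype IRI -> "is this lexical form
# in the lexical space of the datatype?"
DatatypeMap = Dict[str, Callable[[str], bool]]


def is_xsd_integer(lexical_form: str) -> bool:
    """Lexical space of xsd:integer: an optional sign followed by digits."""
    return re.fullmatch(r"[+-]?[0-9]+", lexical_form) is not None


def ill_typed_literals(graph: List[Triple], datatype_map: DatatypeMap) -> List[Literal]:
    """Collect literals whose datatype is in the map but whose lexical form is rejected."""
    bad = []
    for triple in graph:
        for term in triple:
            if isinstance(term, Literal):
                check = datatype_map.get(term.datatype)
                # Datatypes outside the map are unconstrained, as in simple entailment.
                if check is not None and not check(term.lexical_form):
                    bad.append(term)
    return bad


def satisfiable_wrt_literals(graph: List[Triple], datatype_map: DatatypeMap) -> bool:
    """True if no typed literal in the graph is ill-typed under the given map."""
    return not ill_typed_literals(graph, datatype_map)


graph = [(":a", ":b", Literal("foo", XSD_INTEGER))]
empty_map: DatatypeMap = {}
xsd_map: DatatypeMap = {XSD_INTEGER: is_xsd_integer}

print(satisfiable_wrt_literals(graph, empty_map))
print(satisfiable_wrt_literals(graph, xsd_map))
```

The same function serves every regime; only the datatype map passed in changes, which is the property I find attractive in the collapsed factoring.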

Best,
Richard



On 14 Nov 2012, at 20:41, Pierre-Antoine Champin wrote:

> Antoine,
> 
> 
> On Wed, Nov 14, 2012 at 3:48 PM, Antoine Zimmermann <antoine.zimmermann@emse.fr> wrote:
> On 14/11/2012 11:19, Pierre-Antoine Champin wrote:
> > Pat,
> >
> > On Wed, Nov 14, 2012 at 8:16 AM, Pat Hayes<phayes@ihmc.us>  wrote:
> >
> >> What I still don't follow is, why anyone who understands what an
> >> inconsistency is, would even form the idea that an ill-typed literal would
> >> be an inconsistency. It's not the distinction that needs explaining, it's
> >> why anyone would treat them as similar in the first place.  Ill-formedness
> >> is not even in the same category as an inconsistency. Literals aren't true
> >> or false by themselves.
> >>
> >
> > I think the divergence of opinion comes from the fact that
> >
> > * you see typed literals merely as terms (which, strictly speaking, they
> > are), and a term cannot be false; it just denotes something;
> >
> > * others (at least myself!) see a little more in them, namely: an implicit
> > assertion that the denoted thing is indeed in the value space of the
> > datatype.
> >
> > If we decide to bite that bullet, then this could be endorsed in the
> > semantic condition for a *graph*:
> >
> >    if E is a ground RDF graph then I(E) = false if I(E') = false for some
> > triple E' in E,
> >    or if I(E') is not in LV for some typed literal E' in V,
> >    otherwise I(E) =true.
> 
> Ouch. I don't like the fact that the notion of type comes in at the
> level of ground-graph simple entailment.
> 
> I don't see how my proposal above makes the notion of type more present than it was before:
> * typed literals are a subset of V, they were already there
> * LV is a distinguished subset of IR in *all* interpretations, it was already there.
> 
> I don't believe (nor intend) that the proposal above changes the result of simple entailment.
> The only change is that, in order to satisfy the following graph:
> 
>   :a :b "foo"^^xsd:integer .
> 
> an interpretation will have to verify
> 
>   IL("foo"^^xsd:integer) is in LV
> 
> As nothing, in simple entailment, can constrain LV in any way,
> nothing prevents a graph consistent with the current condition from having a satisfying interpretation that meets the condition I propose.
> 
> On the other hand, under XSD-entailment, as "foo" is not a valid lexical form for xsd:integer,
> the semantic conditions for datatypes require of every interpretation that
> 
>   IL("foo"^^xsd:integer) is not in LV
> 
> so no XSD-interpretation can satisfy the graph above under the condition I propose.
> 
> 
> Again, what I'm trying to model is the intuition that any typed literal is claiming that its lexical form is indeed a lexical value of its datatype (in rdf-mt parlance: they denote something in LV). This claim is neutral in simple entailment, where datatypes have no special meaning (LV is not constrained). It has some impact in D-entailment (reflected in rdf-mt by the semantic conditions for datatypes that constrain what can and cannot be in LV).
> 
> Or do you object to this intuition? I had the impression that your proposal was going that way too...
> 
> The more I think of this issue, the more I believe that ill-typed
> literals should be a syntax error. An application that supports a
> datatype should reject RDF graphs that do not write literals of that
> type properly.
> 
> That can work of course.
> But that makes RDF+XSD a sublanguage of RDF, just like OWL-DL is. 
> Worse, that makes RDF+D (with D any set of datatypes) a different sublanguage.
> Makes me feel uneasy.
> 
>   pa
> 
> 
> 
> Note that in OWL 2 Structural Specification and Functional Style Syntax,
> it is required that:
> 
> "The lexical form of each literal occurring in an OWL 2 DL ontology MUST
> belong to the lexical space of the literal's datatype."
> 
> cf. Section 5.7 http://www.w3.org/TR/owl2-syntax/#Literals.
> 
> 
> 
> AZ
> 
> 
> > The first line (from the original definition) accounts for everything
> > asserted explicitly in a triple,
> > while the second line (which I added) accounts for those "implicit"
> > assertions carried by typed literals.
> >
> > Do you think it's a clean way to do it? Or do you consider it as just
> > another "trick" ? :-)
> >
> >    pa
> >
> 
> --
> Antoine Zimmermann
> ISCOD / LSTI - Institut Henri Fayol
> École Nationale Supérieure des Mines de Saint-Étienne
> 158 cours Fauriel
> 42023 Saint-Étienne Cedex 2
> France
> Tél:+33(0)4 77 42 66 03
> Fax:+33(0)4 77 42 66 66
> http://zimmer.aprilfoolsreview.com/
> 
>
Received on Thursday, 15 November 2012 19:28:40 UTC