Re: Review RDF 1.1 Semantics (ED 3rd June 2013)

Pat,


Before answering my comments one by one as you read, please read on to
the end. I was myself guilty of answering your comments before reading
them all through, and I think I would have answered differently otherwise.

I mean: I do not change my mind, but some of the arguments I give later
can reinforce the arguments I give earlier.
I would also appreciate it if Peter would read and ponder my comments
too, and perhaps give an opinion.

What is at stake here is the possibility of a formal objection.



AZ.

Le 14/06/2013 08:32, Pat Hayes a écrit :
>
> On Jun 12, 2013, at 5:53 AM, Antoine Zimmermann wrote:
>
>> Pat, Peter,
>>
>>
>> This is my review of RDF 1.1 Semantics. Sorry for sending this so
>> late. On the plus side, I'd say that overall, the presentation has
>> been much improved, interpretations being independent from a
>> vocabulary is a big bonus, making D-interpretations independent
>> from the RDF vocabulary is also much better. Putting the rules in
>> context with the corresponding entailment regime is also good.
>>
>> Now, for the main criticism, I have two outstanding problems with
>> the current version: 1. D-entailment using a set rather than a
>> mapping; 2. Define entailment of a set as entailment of the union.
>>
>>
>> 1. D-entailment ===============
>>
>> Concerning 1, the implication of the new definition is that given a
>> D, it is not generally possible to know what are the valid
>> D-entailments.
>>
>> For instance, consider D = {http://example.com/dt}. What does the
>> triple:
>>
>> <s> <p> "abc"^^<http://example.com/dt> .
>>
>> D-entails? The specification does not say.
>
> How can it? Those entailments must depend on what is known about the
> datatype and its value space.

That is the problem.

>
>> Moreover, because of the absence of a known mapping from IRIs to
>> datatypes, there are a few ill-defined conditions: For instance, in
>> Section 9, the table "Semantic conditions for datatyped literals"
>> says:
>>
>> """ For every other IRI aaa in D, and every literal "sss"^^aaa,
>> IL("sss"^^aaa)=L2V(I(aaa))(sss) """
>>
>> L2V is only defined for datatypes, whereas I(aaa) is not
>> constrained to be a datatype. Even if it were constrained to be
>> a datatype, this would not define the value of IL("sss"^^aaa),
>> unless aaa is one of the normative XSD datatype IRIs.
>
> True, I will adjust the wording to cover the anomalous case.
>>
>> In any case, no matter how you tweak the definitions, the
>> application MUST have a mapping from the set of "recognised"
>> datatype IRIs to some specific datatypes.
>
> If the application has no such mapping then it is not able to treat
> literals with that type in any special way, so they get treated
> exactly as though they were an unknown name, which is how it would be
> treated in simple interpretations.   So it is not true that the
> application MUST have this mapping: only that if it does not, then
> the presence of this datatype IRI does not change any entailments.
> Which is what the "recognized" terminology is supposed to suggest.

Well, OK, an application is not required to have any of the mappings, but
an application that implements {http://example.com/dt}-entailment must 
have a mapping from http://example.com/dt to a datatype.


> Exactly the same point could be made regarding the 2004
> specification. The semantics there referred to a datatype map, but
> the syntax of RDF did not provide any way to describe or denote this
> map.

RDF Semantics does not define how to communicate the intended entailment
regime in the syntax. It simply defines those regimes (well, it used to
define them; it no longer does so properly in the current RDF 1.1
Semantics draft).

Protocols to communicate the entailment regime to be used can be 
standardised separately from the formal definition of the regimes.


> When provided with some RDF containing literals typed with
> http://example.com/dt, there is nothing that defines the datatype map
> that is supposed to be used on this RDF. And both in 2004 and here,
> if you are faced with an IRI typing a literal and you do not know
> what datatype it is supposed to be denoting, then you simply treat
> the literal as you would any other unknown name.

Still, if I know which RDF 2004 entailment regime is implemented in an
application, I know exactly and unambiguously what I can infer from:

<s> <p> "abc"^^ex:dt .

While in the current ED, if I'm told the application is implementing 
{ex:dt}-entailment, I do not have a clue what to infer from the statement.
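
To make the contrast concrete, here is a minimal Python sketch. It is
only an illustration under assumptions of mine: the datatype attached to
http://example.com/dt (lexical forms are strings, values are their
lengths) is invented, and the names Datatype, value_with_map and
value_with_set are hypothetical, not taken from any specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass(frozen=True)
class Datatype:
    """A datatype reduced to its lexical-to-value mapping (L2V)."""
    l2v: Callable[[str], object]


EX_DT = "http://example.com/dt"

# 2004 style: D is a map, so the datatype itself comes with the regime.
# Assumption for illustration: the value of a lexical form is its length.
datatype_map: Dict[str, Datatype] = {EX_DT: Datatype(l2v=len)}

# Current draft style: D is only a set of "recognised" IRIs.
recognised: Set[str] = {EX_DT}


def value_with_map(lexical: str, iri: str, d: Dict[str, Datatype]) -> object:
    """The literal's value is fully determined by D."""
    return d[iri].l2v(lexical)


def value_with_set(lexical: str, iri: str, d: Set[str]) -> object:
    """D says the IRI is recognised, but not which datatype it denotes."""
    if iri not in d:
        raise KeyError(iri)
    raise LookupError("no L2V available: the datatype must come from outside D")
```

With the map, the value of "abc"^^<http://example.com/dt> is determined
by D alone; with the set, something outside D has to supply it.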


> What one should do, of course, faced with a literal typed with an
> unknown IRI, is to use that IRI itself to try to find out what it
> identifies. In other words, you should try to find out what its
> **denotation** is, because that will be the datatype. Which is
> exactly what the current account of datatypes suggests you should
> do.

RDF Semantics is not standardising what has to be done to figure out 
these things. This is a matter of other documents or protocols that are 
based on RDF.


>>
>>
>> Later in Section 10, it says that "if D < E and S E-entails G then
>> S D-entails G." Since no constraints are given on how to interpret
>> "recognised" non-XSD datatype IRIs, it is possible that the same
>> IRI in D-entailment is interpreted differently in E-entailment.
>
> There is normative prose to ensure that this cannot happen.

To what prose are you referring?

>>
>> In Section 11, table "RDF semantic conditions":
>>
>> """ For every IRI aaa in D, <x,I(aaa)> is in IEXT(I(rdf:type)) if
>> and only if x is in the value space of I(aaa) """
>>
>> This is ill-defined because the value space of I(aaa) may not
>> exist.
>
> In which case, x cannot be in the nonexistent value space, so the
> definition says that <x, I(aaa)> is not in the extension, so any
> triple
>
> xxx rdf:type aaa .
>
> is false.

The definition does not imply that it is false. The definition is simply
ill-defined: it does not say what has to be done when I(aaa) is not a
datatype, and it tries to apply a function to a thing that does not
belong to its domain. To recover well-definedness, you must have a
partial function over the set of resources that associates some resources
(the datatypes) with a value space, and nothing with the rest. But you
have a simpler option, which is to have a datatype map.


>> Again, even if I(aaa) is constrained to be a datatype, how do we
>> know what is its value space? Therefore, the condition cannot be
>> verified in general.
>
> In order to know the value space of any datatype, we have to appeal
> to external sources. There is a presumption that datatypes are
> identified by IRIs, which seems reasonable in the RDF context.

A datatype map is providing the datatype itself, therefore it provides 
the value space. There is no need to appeal to external sources if you 
know what datatype map you are using. You must appeal to external 
sources when all you know is that {http://ex.com/dt} is recognised.


>>
>> Finally, the reasons why this change has been made are unclear.
>
> It was an editorial decision, to make the exposition simpler and
> easier to understand. (I would add that putting datatype maps into
> the semantics in the first place was also an editorial decision.) It
> does not materially change the semantics and does not affect any
> entailments.

It's not editorial: it changes what can be concluded under a given
D-entailment regime. Basically, it cancels the conclusions that you could
draw when knowing the map D. Now you have to know D as well as some
external things that are not indicated in the formal definitions.


>> The working group was not chartered to do anything about that, the
>> workshop in 2010 did not point at all to any problems with datatype
>> maps, this Working Group did not discuss or complain about the D
>> being a map when the change was made. No prior discussions were
>> attempted before making the change.
>>
>> Implementations that rely on custom datatypes are interpreting the
>> custom datatype IRIs according to one specific, known datatype,
>> therefore, they do have a datatype *map* implemented.
>
> Or, you could say the same thing by saying, they have a fixed
> interpretation of the custom datatype IRIs.

Sure. But you must have the association between the IRI and that fixed 
interpretation in order to know the entailments. Precisely, presented in 
one way or another, you need a mapping, somehow. RDF 2004 makes this 
explicit in the specification of all D-entailment regimes. *This* is 
clearer to me.


> Which is a natural and
> IMO clearer way to express the same thing.
>
>> There is zero motivation to make such a change.
>
> It simplifies the exposition by removing an irrelevant and confusing
> complication, making the semantics (marginally) easier to follow.

It does not simplify. It obscures things by having the required mapping
invoked outside the formal definitions. The mapping is still
required but the requirement is hidden. It is not even clear whether the 
requirement, indicated in prose somewhere, actually holds in the 
definition of D-entailment.


> As
> it has been repeatedly asserted that understanding the arcane
> specification documents is the greatest barrier to RDF deployment,
> this is not a trivial motivation.

It is not arcane, it is a *mapping*, a simple math structure that 
everybody meets at school.

>
> The 2004 mode of presentation had its problems as well. It was quite
> possible, for example, to consider a D which mapped xsd:string to,
> say, rdf:PlainLiteral. There was nothing to constrain the 'standard'
> IRIs to map to their "obvious" meanings, or to the meanings obtained
> by conventional Web methods such as following HTTP links.  Richard
> had already noted how crazy this was in one of his blogs about a
> year ago, and there was considerable comment to the effect that this
> ought to be fixed.

This has nothing to do with D being a mapping. The problem is fixed. The 
xsd: datatypes must refer to their respective normative datatypes.

I repeat: there is zero motivation to make such a change.


If you are not convinced after all that, we'll have to decide with a 
vote. If the vote is on your side, we'll proceed to Last Call and future 
phases with a formal objection from my side.


>>
>
>> 2. using union ============== This issue is different from the
>> previous one because it does not make the definitions and
>> propositions incorrect.
>>
>> I see two problems with the new definition: first, it makes the
>> notion of entailment in RDF different from the standard,
>> universally accepted notion of entailment in logic.
>
> It is completely standard to treat a set of assertions as equivalent
> to its conjunction.

I do not know how you've learnt about conjunction, but for me the 
conjunction of A and B is true iff A is true and B is true. This is what 
used to be in RDF 2004 and this is not what is in RDF 1.1.


>> In general, no matter what semantics is considered entailment is
>> defined as follows:
>>
>> """ A set S of formulas in the language entails a formula F in the
>> same language if and only if all interpretations that satisfy all
>> the formulas of S also satisfy F. """
>>
>> That's what was in RDF 2004, that's what's in OWL, that's what's in
>> any logic with a model-theory.
>
> The key point is that in any normal logic, these two ways of phrasing
> are exactly equivalent,

Indeed!


  so the choice between them is purely
> aesthetic. But in RDF, now that we explicitly allow two different
> graphs to share a bnode,

It always has been allowed, in spite of you constantly claiming the 
contrary.


  this equivalence is rather trickier.
> Basically, we now allow in RDF a situation where we can conjoin
> (union) graphs both inside and outside quantifier scopes.

No. YOU allow this strange situation by making a set of RDF graphs 
equivalent to their union.


> This is not
> a normal situation in any conventional logical syntax,

Absolutely.


> so we are to
> some extent on our own, and appealing to what is "normal" isn't
> helpful.

RDF is perfectly coherent with "normal" logics, as long as we keep the 
"normal" definitions (viz., that the conjunction of A and B is true 
whenever A is true and B is true).


>> There are also inconvenient consequences for manipulation of RDF
>> graphs: how is it supposed to be implemented? Assume we have two
>> representations of two graphs. How do you know what's the union of
>> the two graphs?
>
> First you use the scoping rules on bnodeIDs to make sure that you
> keep all the various blank nodes distinct. Then you take the union of
> the sets of triple representations. Exactly like you do now.
>
>> You do not have access to the bnodes, only to identifiers or
>> locations in files or in memory. There is a rule of thumb saying
>> "different documents, different bnodes".
>
> I prefer to talk about identifier scopes rather than documents, as it
> is more precise. But yes.
>
>> And what about RDF graphs in an in-memory model?
>
> Again, I presume that the model somehow specifies identity of blank
> nodes. In which case, what is the problem?
>
>> What about two examples of RDF graphs in Turtle in a written
>> article? They are in the same document, they certainly share
>> bnodes, right?
>
> Well, that depends upon what the text of the article says about its
> use of blank node identifiers, presumably, unless it is using a
> diagrammatic network representation, in which case identity of blank
> nodes can be checked visually.
>
> Your point about not having access to the actual blank nodes applies
> equally to ANY definition of ANY way of combining graphs. Go back in
> time to 2004 and imagine two RDF/XML documents coming from
> independent sources which do not use the same bnodeIDs anywhere. They
> each describe an RDF graph. Do those graphs share blank nodes? How
> could you possibly tell?

Indeed.

  If they did, how would you fix that
> situation?

If the truth of the two graphs is equivalent to the truth of the merge,
I have no reason, in general, to even care whether they share bnodes
or not. However, if I want to consider the union of the two graphs, I
must know whether or not they share bnodes. It's still problematic
because I *may* want to perform union anyway, but at least I know I
have a truth-preserving operation for which I don't need to know.


> You don't have access to the *actual blank nodes*, only to
> some bnodeIDs in some surface (document) syntax. You can just tell
> the abstract-syntax story and say, make sure they don't share any
> bnodes before combining them, as we did in 2004 (without saying HOW
> you were supposed to do this.) The problem is, that blanket
> merge-not-union rule is now wrong in some cases.

No.

> The graphs really
> can share bnodes, now, and then merging them would lose information.

If you know what bnodes are shared, and you want to do union instead of 
merge, no problem. That case was even presented in RDF 2004. But merge 
preserves truth, not adding any knowledge, and as said in the 
introduction of RDF Semantics, it is the matter of this very specification.
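
As a rough illustration of the two operations, here is a minimal Python
sketch. It assumes graphs are plain sets of (s, p, o) tuples and that a
blank node is an object compared by identity; the BNode class and the
helper names are hypothetical and not taken from any RDF library.

```python
from itertools import count


class BNode:
    """A blank node: it has no name, only object identity."""
    _ids = count()

    def __init__(self):
        self._n = next(BNode._ids)  # used for printing only

    def __repr__(self):
        return f"_:b{self._n}"


def bnodes(graph):
    """All blank nodes occurring in a graph (a set of triples)."""
    return {t for triple in graph for t in triple if isinstance(t, BNode)}


def union(g1, g2):
    """Plain set union: shared blank nodes stay shared."""
    return g1 | g2


def merge(g1, g2):
    """Union after replacing the blank nodes of g2 with fresh ones."""
    fresh = {b: BNode() for b in bnodes(g2)}
    renamed = {tuple(fresh.get(t, t) for t in triple) for triple in g2}
    return g1 | renamed


b = BNode()
g1 = {(b, "p", "o")}
g2 = {(b, "q", "u")}

print(len(bnodes(union(g1, g2))))  # one blank node: the sharing is kept
print(len(bnodes(merge(g1, g2))))  # two blank nodes: standardised apart
```

The merge needs no knowledge of whether b was shared on purpose; the
union is only meaningful if one knows that it was.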


> If they don't share blank nodes, then merging and unioning are the
> same operation; if they do share a blank node, then taking the union
> is the correct thing to do, because the merge loses the information
> about the sharing.  Either way, unioning is correct.

What you assume is that a given blank node indicates the existence of 
the same thing across all possible graphs. In a way, it's like an IRI 
whose actual name is unknown. It denotes. But this is not how bnodes are 
defined.

Compare FOL formulas: given the variable x, taken from the infinite set
of variables, consider the formulas:

∃x P(x)
∃x Q(x)

These formulas share a variable. From these two formulas, I conclude:

∃x∃y P(x),Q(y)

I have to separate apart the existentials to make it into a single 
formula. If, by chance, you have extra knowledge that the x having 
property P is precisely the x that has property Q, you could simplify 
this into:

∃x P(x),Q(x)

But it is saying more than the initial formulas. Bnodes are 
existentials, so if you have two graphs that share the bnode b:

{(b,<p>,<o>)}  and  {(b,<q>,<u>)}

then you can define their meaning as:

∃x Triple(x,<p>,<o>)
∃x Triple(x,<q>,<u>)

Trivially, the conjunction of those graphs ought to be:

∃x∃y Triple(x,<p>,<o>),Triple(y,<q>,<u>)

If it's not, then there is something really broken in RDF Semantics.
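
The point can be checked mechanically on a small interpretation. The
following Python sketch is a simplification under my own assumptions:
blank nodes are represented by the strings "_:b" and "_:c", IRIs denote
themselves, and the function name satisfies and the tiny interpretation
(IR, EXT) are invented for the example.

```python
from itertools import product

BNODES = {"_:b", "_:c"}


def satisfies(ir, ext, graph):
    """True if some assignment of the graph's blank nodes to elements of
    ir makes every triple (s, p, o) hold, i.e. (s, o) is in ext[p]."""
    used = sorted({t for triple in graph for t in triple if t in BNODES})
    for values in product(ir, repeat=len(used)):
        a = dict(zip(used, values))
        if all((a.get(s, s), a.get(o, o)) in ext.get(p, set())
               for s, p, o in graph):
            return True
    return False


g1 = {("_:b", "p", "o")}
g2 = {("_:b", "q", "u")}
union = g1 | g2                     # keeps the shared blank node
merge = g1 | {("_:c", "q", "u")}    # standardises it apart

# An interpretation in which different things have the two properties.
IR = ["x", "y", "o", "u"]
EXT = {"p": {("x", "o")}, "q": {("y", "u")}}

print(satisfies(IR, EXT, g1), satisfies(IR, EXT, g2))  # True True
print(satisfies(IR, EXT, merge))                       # True
print(satisfies(IR, EXT, union))                       # False
```

This interpretation satisfies each graph and their merge, but not their
union, which is exactly the gap between "satisfies all graphs in S" and
"satisfies the union of S".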


>> Now if we take the simple case when the application is able to
>> determine that the bnodes are disjoint, how can it perform a union?
>> The answer is that it must *separate apart* the bnode identifiers.
>
> All this is about the blank nodes *themselves*, not the bnodeIDs.
> BnodeIDs are purely a surface syntax matter. There are no bnodeIDs in
> the abstract RDF graph syntax. And to perform a union, you just,
> well, take the union of two sets. What could be simpler or more
> obvious?
>
>> So, while in 2004 there was a coherence between the way merge was
>> defined and the way it has to be implemented, now there is a
>> discrepancy between the definition and the practice.
>
> It is (and it always was) important to keep the levels clearly
> distinct. BnodeIDs are part of a surface syntax and obey the scoping
> rules of that surface syntax. Those rules determine when two bnodeIDs
> identify the same bnode, and (implicitly) when they don't. Once that
> is all worked out, then we define operations on the graphs (not on
> the surface documents), and it is these operations on the graphs that
> we are describing here. In 2004, you could do all this surface
> bnodeID standardizing apart, and you could STILL be left with a
> situation where the actual blank nodes needed to be "standardized
> apart" even after all the bnodeIDs had been dealt with.

What? I don't understand what you are saying, but there are no APIs that 
are unable to perform a merge correctly.

> Which is
> ridiculous, but we had to say it because there was nothing in the
> 2004 RDF abstract syntax model that ensured that distinct graphs from
> unrelated sources did not accidentally share blank nodes.

RDF graphs are sets. They don't have sources.


> We have now
> made that clear, so there is no need to protect against "accidental"
> bnode identities; and we have also made it possible for distinct
> graphs to genuinely share bnodes, so we need to allow this case to be
> handled correctly. Both of these mean that the union, rather than the
> merge, is the correct way to combine graphs. If two graphs really do
> share the same blank node, then they should stay being that blank
> node, because then their union faithfully represents the larger graph
> of which they are parts.

This is saying that bnodes denote a unique entity (though unnamed) 
across all graphs. This is not the semantics of bnodes and it's 
incorrect. It means that bnodes cannot be replaced without changing the 
meaning. It means they do not merely indicate the existence of a thing.

>
> I will write a paragraph or section to try to make this all very
> obvious.
>
>> Then, once the separation apart is made to produce a representation
>> of the union, the created graph is, by definition of union, sharing
>> bnodes with the two original graphs. But how can the overlap of
>> bnodes be recognised in and out of the application? One would need
>> meta-information about the relationship between the graphs.
>
> No need to call it meta-information, but any dataset surface notation
> could create a situation where two distinct named graphs in a dataset
> share a common blank node. I guess this would probably be handled by
> the dataset surface syntax rules for identifier scopes.
>
>> And how to represent and store that relationship?
>
> Ask the implementors of your dataset-representing system :-)
>
>> Also, if one wants to keep two graphs that share bnodes separate
>> (say, they are distinct graphs in the same TriG files). Then these
>> graphs cannot be stored separately if one wants to retain
>> equivalent inferences on the set of graphs. That is, if I have
>> {G1,G2} such that G1 and G2 share some bnodes, storing G1 apart
>> would create a "copy" of G1 with disjoint bnodes. The new graph,
>> H1, would be equivalent to G1, but the set {H1,G2} would not yield
>> the same entailments as {G1,G2}.
>
> Right, because you would have lost some information when you
> performed that separation. But note, if you insist upon enforcing
> graph merging rather than union, you would have lost this information
> even when using the dataset.

No, one graph indicates the existence of some things, and so does the
other. The things in existence need not be the same (just as the same
variable used as an existential in different formulas need not indicate
the existence of the same thing for formula1 and formula2). If I store
those two files, they each continue to indicate as many existing things
as before.


  In fact, if we insist upon merging
> rather than unioning graphs, there really is no point in allowing
> graphs to share bnodes in a dataset, since graph operations will
> "un-share" them.

No, union does not unshare them. There are triplestores that perform 
reasoning over the union of the named graphs. This is allowed by our 
non-existing dataset semantics.


>> Finally, the decision to replace merge with union was first put
>> into the document without prior discussion with the Working Group,
>> without evidence that it follows practices,
>
> It clearly does, since it is routine to take the union of graphs. In
> fact, this is by far the most common operation in RDF. All entailment
> rules for example union (rather than merge) consequences into a
> graph, and unioning rather than merging is required inside datasets.

Union is one operation among many on graphs. Deleting or adding triples is
another. Merge is another. The difference is that merge preserves truth.

>
>> without evidence that it solves known issues.
>
> It solves an issue with the 2004 definitions that I have been noting
> ever since the WG began (and in fact before then).

So?

>
>> The notion of merge was not identified as a subject of concern
>> during the W3C workshop in 2010. Implementations do implement the
>> RDF 2004 correctly.
>
> I do not think that there is a single RDF implementation that
> implements the 2004 notion of merge. Remember, it is not talking
> about standardizing apart bnodeIDs (that is indeed often done) but
> blank nodes *themselves*. If you can find me any RDF engine that does
> this, I will buy you a good steak dinner.

Ahah, this is trivial to fix.
In fact, RDF 2004 defines *a* merge of two graphs, and it is not 
required that the final graph has bnodes disjoint with the bnodes of the 
original graphs. However, if the original graphs do not share bnodes, 
indeed the result must be the union, if we follow the definition 
strictly. First, it is easy to fix if we really want to be pedantic, and
second, in any case, the operations performed by implementations 
preserve truth and build graphs isomorphic to a merge. That is what matters.
What is clear, however, is that they do not perform union.


>> Conclusion: =========== More generally, any change like this is
>> disturbing education. If this design is standardised at the end of
>> the year, there will be a gap between what's in the standard and
>> what has been written for years in tutorials, courses, research
>> papers, and so on.
>
> Actually I think there will not be. Tutorials, courses, and
> implementations have all considered standardizing apart to be an
> operation applied to blank node *identifiers*, even when they used
> the blank-node terminology. This has been a source of muddle and
> confusion ever since the original specs were published, in fact. The
> current way of describing things helps to keep the situation
> clearer.

I claim they do not, but as it is a question of taste, I won't try 
arguing about the clarity.


>>
>> Considering that I see no added value compared to 2004 from both
>> these changes, and having even identified flaws, I oppose
>> publication of RDF 1.1 Semantics in such a state.
>
> To reproduce the merge language from the 2004 documents, without
> making corrective changes elsewhere (eg to the semantic rules for
> bnodes) would now be an error, and would make the specifications
> internally incoherent.

What do you mean by "the semantic rules for bnodes"? Do you mean the 
"Semantic conditions for blank nodes"? These ones actually support my 
claims. You may want to modify them to support your claim. For instance, 
try:

"""
If E is an RDF graph then I(E) = true if [I+A](E) = true for some 
mapping A from the set of all blank nodes to IR, otherwise I(E)= false.
"""

Remove the mention of bnodes of E, and you'll have your "union" 
semantics (note that I am opposed to this, but at least, it would make 
all your claims coherent).


  That is not an option. IMO this change is the
> simplest, most conventional and clearest way to correct the confusion
> in the 2004 specs.

Considering that I am pretty certain that you won't change your mind, we 
will then have a second formal objection if this design goes through the 
next phases.

>
>> Note that the solution I propose is simple and simpler than what is
>> proposed: to go back to the old design concerning entailment of a
>> set of graphs and datatype map. My proposal is also less likely to
>> trigger unsupportive comments in the Last Call phase. We cannot
>> afford to spend more time inventing a new design.
>>
>>
>>
>> Minor remarks: ============== I think there are too many sections.
>> Simple interpretations and simple entailment can be subsections of
>> a common section. The same for D-interpretations and D-entailment.
>> Same for RDF interpretations and RDF entailment; same for RDFS.
>
> You might be right. I will try making a simplified version with this
> more compact presentation style throughout.
>
>>
>> Section 3: """For example, RDF statements of the form:
>>
>> ex:a  rdfs:subClassOf  owl:Thing .
>>
>> are forbidden in the OWL-DL [OWL2-SYNTAX] semantic extension."""
>>
>> -> This triple can be a valid part of an OWL 2 DL ontology. A
>> better example would be:
>>
>> ex:a  rdfs:subClassOf  "Thing" .
>>
>> Moreover, perhaps a reference to OWL 2 mapping to RDF graphs [1]
>> would be better, since [OWL2-SYNTAX] defines OWL 2 ontologies in
>> terms of a functional syntax that does not say anything about the
>> constraints in the RDF serialisation.
>
> Good point, I will change the reference.
>
>>
>> Section 4: "A typed literal contains two names" -> We do not have
>> the notion of typed literals since all literals are typed. "Two
>> graphs are isomorphic when each maps into the other by a 1:1
>> mapping on blank nodes." -> this is very much underspecified. There
>> are other constraints on isomorphisms.
>
> ?There are? What are they?

See RDF 1.1 Concepts, Section 3.6.

>
>> "Graphs share blank nodes ... of distinct blank nodes." -> this
>> discussion should not be here. In fact, it should rather appear in
>> Concepts.
>
> I am happy as long as it appears somewhere. Several people have
> suggested putting some of this material into Concepts, but after
> having it marked as an Issue for several drafts I just gave up on it
> in order to move forward to LC.
>
>> In any case, it does not belong to notation and terminology. "This
>> document will often treat a set of graphs as being identical to a
>> single graph comprising their union, without further comments." ->
>> if my concerns above are taken into account, this should be
>> removed. A definition of merge should be added instead. By the way,
>> I haven't seen many sets of graphs being treated as a single graph.
>> Actually, I think I only saw it twice. So we cannot say "often".
>
> I will remove "often"
>
>>
>> Section 5: Make it a subsection of "Simple semantics"? "Simple
>> entailment"? "a function from expressions (names, triples and
>> graphs) to semantic values:" -> what's a "semantic value"?
>
> Good question. I will rephrase.
>
>> "triple s p o then ..." -> why not "triple (s, p, o)" ?
>
> Yes
>
>> Same remark in item 4 of Section 5.2
>>
>> Section 6: Make it a subsection of "Simple semantics"? "Simple
>> entailment"? "a graph G simply entails a graph E when every
>> interpretation which satisfies G also satisfies E, and a set S of
>> graphs simply entails E when the union of S simply entails E" ->
>> change this to "a set S of graphs simply entails a graph E when
>> every interpretation which satisfies all graphs in S also satisfies
>> E" Remove the Change Note.
>
> No. See above. (There is an alternative approach, which is to define
> the bnode truth conditions to apply to bnode scopes rather than to
> graphs. An earlier edit did have this option, but it was removed
> after objections from PFPS; and you have argued against the notion of
> scope. )

Not against the notion of scope in general, but the way it was defined 
and its consequences on semantics. I still support having a discussion 
on scope in Concepts (perhaps informative to avoid conflicts in how it 
ought to be formalised).

>
>> Section 6.1: "the inference from (P and Q) to P, and the inference
>> from foo(baz) to (exists (x) foo(x))." -> the notation "(P and Q)"
>> etc is rather obscure in this context.
>
> Indeed. I had intended to remove this, and I will.
>
>> Perhaps it would be good to present the usual First Order Logic
>> translation of the semantics. BTW, the usual FOL translation would
>> not be valid for entailments over a set of graphs because
>> {FOL(G1),FOL(G2)} is equivalent to FOL(merge(G1,G2)).
>
> But this FOL map is no longer correct when the graphs share blank
> nodes. In any case, this is overkill for the presentation at this
> point. The analogy to FOL is intended only to be helpful to some
> readers in passing, and I now think it is causing more harm than
> good.

The FOL construction is still correct, and ought to be correct.
Do I really need to write the proofs to show you?

If I can avoid losing time doing so, I would appreciate that you check 
it yourself. Or check with PFPS.

>
>> The example with ex:a ex:p _:x is confusing RDF graphs and RDF
>> documents, as well as bnodes and bnode identifiers.
>
> Standardizing apart is purely a bnodeID concern. You do this when you
> might accidentally conflate bnodes coming from different sources just
> because the local IDs happen to coincide. But when, after you have
> done due care with bnodeIDs, you find that two graphs really do share
> an actual blank node, then (as this example tries to illustrate) you
> should NOT separate that node into two blank nodes, because that
> loses information.
>
> I will try to find a way to make the wording clearer.
>
>> Then, while the naive readers would intuitively imagine that taking
>> the union of the two triples would simply amount to putting them
>> together, they realise that they have to "standardise apart" the
>> bnode identifiers.
>
> Well, as in this example, they don't always have to. But if they do,
> the result loses information and is no longer exactly equivalent to
> the graph they started with. I think this is easy to understand. The
> real loss of information happens when a blank node shared between two
> graphs is separated into two blank nodes.

The loss of information is about syntactic information. You also lose 
information in that sense when you replace "2.0"^^xsd:decimal by 
"2"^^xsd:decimal . What matters is the indication of the existence of a 
thing with certain properties, not who made the indication.
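
A small Python illustration of that distinction, assuming Python's
decimal.Decimal as a stand-in for the lexical-to-value mapping of
xsd:decimal (an approximation chosen for the example, not the normative
XSD definition):

```python
from decimal import Decimal

lexical_a = "2.0"  # lexical form of "2.0"^^xsd:decimal
lexical_b = "2"    # lexical form of "2"^^xsd:decimal

# Syntactically the two literals differ...
print(lexical_a == lexical_b)                    # False

# ...but they denote the same value, so nothing is lost semantically.
print(Decimal(lexical_a) == Decimal(lexical_b))  # True
```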

>
>>
>> Section 7: "For any graph H, if sk(G) entails H then there is a
>> graph H' such that G entails H' and H=sk(H')" -> this should rather
>> be: "For any graph H, if sk(G) entails H then there is a
>> skolemization sk'(H) of H such that G entails sk'(H)"
>
> No, because sk'(H) would use different Skolem vocabulary, so sk'(H)
> =/= sk(H')

It has to use a different Skolem function because H may not share any 
bnodes with G: "sk is a skolemization mapping from the blank nodes in G"
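
A small sketch of the point, under assumptions of my own: graphs are 
frozensets of triples, blank nodes are identity-only objects, and Skolem 
IRIs are minted under a made-up example.org prefix. The helper names 
skolemization and apply_sk are hypothetical. It only shows that a mapping 
built from the blank nodes of G is undefined on a blank node that occurs 
in H but not in G.

```python
import uuid


class BNode:
    """Blank node: equality is object identity."""


def skolemization(graph, base="http://example.org/.well-known/genid/"):
    """Map each blank node occurring in `graph` to a fresh IRI.
    The base IRI is an assumption made for this illustration."""
    nodes = {t for triple in graph for t in triple if isinstance(t, BNode)}
    return {b: base + uuid.uuid4().hex for b in nodes}


def apply_sk(sk, graph):
    """Replace blank nodes via sk; KeyError if sk is undefined on one."""
    def f(term):
        return sk[term] if isinstance(term, BNode) else term
    return frozenset(tuple(f(t) for t in triple) for triple in graph)


x, y = BNode(), BNode()
G = frozenset({("ex:a", "ex:p", x)})
H = frozenset({("ex:a", "ex:p", y)})   # same shape as G, different blank node

sk = skolemization(G)        # defined on the blank nodes of G only
sk_G = apply_sk(sk, G)

try:
    apply_sk(sk, H)
    sk_defined_on_H = True
except KeyError:
    sk_defined_on_H = False  # y is not in the domain of sk
```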

>
>>
>> Section 8: Remove the second change note, as per my concerns
>> above. "datatype d refers to (denotes) the value" -> why not just
>> say "denotes"
>
> Yes.
>
>> "L2V(d)(string)" -> rather, L2V(d)(sss)
>> "rdf:plainLiteral" -> "rdf:PlainLiteral"
>
> OK
>
>> "the datatype it refers to MUST be specified unambiguously" -> yes,
>> there MUST be a mapping from datatype IRIs to datatypes, i.e.,
>> there must be a datatype map. This is a MUST, why doesn't it appear
>> as a constraint in the formal semantics?
>
> The datatype map is just the interpretation mapping restricted to the
> vocabulary of recognized IRIs. We have to allow for the case where an
> IRI is used as a datatype IRI but its interpretation isn't known to
> us. We could make this illegal in some way, but if it's legal (IMO as
> it should be) then what do we say, semantically? The standard way to
> handle lack of information in model theory is to encode it as
> multiple satisfying interpretations. So we allow I(ddd) to vary, to
> represent the case where we don't know what ddd actually means. The
> trouble is, however, that this way of finding out what it means is by
> a mechanism which is *outside of*  and *invisible to* the model
> theory itself. This situation does not arise in conventional logical
> semantics, but it is central here on the Web. So we have to appeal to
> conditions which are not stated and are not even expressible in the
> formal equations. We can't give formal truth conditions for "being a
> recognized datatype IRI".

When I read this, it seems that there is a formal requirement imposed 
on us to reject datatype maps at all costs, no matter what. The 
requirement is so strong and so unavoidable that even if datatype maps 
solve all the problems that I mentioned, you won't use them. This is 
spending a huge amount of time, and blocking transition, in order to fix 
a design that simply does not work better than the old one.


> If we say that I(ddd)=d for some fixed datatype d, then we have in
> effect said that this IRI denoting its datatype is logically
> necessary, a tautology. Which is incorrect and confusing.

If the entailment regime is D = {ddd,d} then it ought to be a tautology 
in that entailment regime. If the entailment regime is just {ddd}, it's 
not possible to say what I(ddd) is in general, so you need extra 
knowledge saying that ddd denotes datatype d, and that extra 
knowledge will make it necessary that I(ddd)=d, which amounts to the 
same thing. So why make the change in the first place?


> What
> actually determines the datatype identified by an IRI is the state of
> the Web, which is something altogether outside of model theory, and
> we don't have any formal way to refer to it.

That's why defining the regimes according to a map that unambiguously 
defines which entailments with literals are valid, and letting users 
or applications decide how they select the right regime, seems to me the 
most appropriate way to stay away from these issues.
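
As an illustration only, here is a sketch of what "a regime defined by a 
map" could look like in Python. The dictionary D and the function 
same_value are hypothetical names; the lexical-to-value mappings are crude 
stand-ins (Python's Decimal and int constructors) and not the real XSD 
definitions. The check mirrors the general rule proposed for Section 10.1: 
"xxx"^^ddd may be replaced by "yyy"^^eee when both literals denote the same 
value.

```python
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical datatype map: datatype IRI -> lexical-to-value mapping (L2V).
D = {
    XSD + "decimal": Decimal,
    XSD + "integer": int,
}


def same_value(dmap, lit1, lit2):
    """Compare two typed literals given as (lexical form, datatype IRI).

    Returns True or False when the map settles the question, and None
    when a datatype IRI is not in the map, i.e. the regime cannot say.
    """
    if lit1 == lit2:
        return True                      # identical literals, trivially
    (lex1, dt1), (lex2, dt2) = lit1, lit2
    if dt1 not in dmap or dt2 not in dmap:
        return None
    return dmap[dt1](lex1) == dmap[dt2](lex2)


r1 = same_value(D, ("2.0", XSD + "decimal"), ("2", XSD + "decimal"))
r2 = same_value(D, ("2", XSD + "integer"), ("2.0", XSD + "decimal"))
r3 = same_value(D, ("abc", "http://example.com/dt"),
                ("abd", "http://example.com/dt"))
```

With the map above, r1 and r2 are settled by the map, while r3 is left 
undetermined because http://example.com/dt has no entry.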

> Which is why I think
> just leaving it be the denotation of the datatype IRI is exactly
> appropriate.
>
>>
>> Section 9: Make it a subsection of "D-semantics"? "D-entailment"?
>>
>> Section 10: Make it a subsection of "D-semantics"? "D-entailment"?
>> "a set S of graphs (simply) D-entails or entails recognizing D a
>> graph G when every D-interpretation which makes S true also
>> D-satisfies G." -> "a set S of graphs (simply) D-entails a graph G
>> when every D-interpretation which satisfies all graphs in S also
>> D-satisfies G."
>>
>> Section 10.1: why not put the general rule for datatype
>> entailment: """ aaa uuu "xxx"^^ddd => aaa uuu "yyy"^^eee where
>> L2V(I(ddd))(xxx) = L2V(I(eee))(yyy) """
>
> Good idea.
>
>>
>> Section 11: Make it a subsection of "RDF semantics"? "RDF
>> entailment"?
>>
>>
>> Section 12: Make it a subsection of "RDF semantics"? "RDF
>> entailment"?
>>
>> Section 12.1: Group the rules together, as in Section 14.1
>>
>> Section 13: Make it a subsection of "RDFS semantics"? "RDFS
>> entailment"?
>>
>> Section 14: Make it a subsection of "RDFS semantics"? "RDFS
>> entailment"?
>>
>> Section 15: "plus an optional default graph" -> the default graph
>> is not optional, there must be exactly one
>
> Ah, OK.
>>
>> Appendix A: "follows exactly the terms used in [OWL2-SYNTAX]" -> it
>> is [OWL2-PROFILES], in Section 4.3. OWL2-SYNTAX does not rely on
>> RDF triples
>
> OK
>
>> "Every RDF(S) closure, even starting with the empty graph, will
>> contain all RDF(S) tautologies" -> not all, the closure as defined
>> is finite, while there are infinitely many tautologies. All
>> tautologies concerning the vocabulary of the initial graph, union
>> the tautologies in the RDF and RDFS vocabularies.
>
> Yes. I will rewrite this.
>
>>
>> Appendix C: The proof that every graph is satisfiable does not need
>> to introduce Herbrand interpretations and does not need to build an
>> interpretation for each graph considered. There is a single
>> interpretation that makes all RDF graphs simply true. Consider a
>> domain comprising only one element x. Map all IRIs and literals to
>> x, including those used as predicates. Make the IEXT of x be the
>> single pair {(x,x)}. This simply satisfies all graphs.
>
> Yes, it does. But I wanted to use the Herbrand interpretation idea in
> another proof. Hmmm, I will think about this.
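
The one-element interpretation can be sketched in a few lines of Python. 
This is only an illustration under simplifying assumptions: terms are 
plain strings, blank nodes are treated like any other term and sent to x, 
and the names X, denote, IEXT and simply_satisfies are hypothetical.

```python
X = "x"  # the single element of the domain


def denote(term):
    """Every IRI, literal and blank node is mapped to the one element."""
    return X


# The property extension of x is the single pair (x, x).
IEXT = {X: {(X, X)}}


def simply_satisfies(graph):
    """A graph is true when, for every triple (s, p, o), the pair
    (I(s), I(o)) belongs to IEXT(I(p))."""
    return all((denote(s), denote(o)) in IEXT[denote(p)]
               for (s, p, o) in graph)


g = {
    ("ex:a", "ex:p", "_:b1"),
    ("ex:a", "ex:q", '"abc"^^<http://example.com/dt>'),
    ("_:b1", "ex:p", "ex:a"),
}
result = simply_satisfies(g)
```

Whatever triples are put in g, the check reduces to asking whether (x, x) 
is in {(x, x)}.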
>>
>> Appendix D.1: "The subject of a reification,/a>" -> typo
>
> OK
>>
>> Appendix D.2: The RDF container vocabulary should also mention
>> rdfs:member, rdfs:ContainerMembershipProperty.
>
> Those are in RDFS, not RDF. Their meaning requires using RDFS classes
> and rdfs:subPropertyOf, respectively.

Ok.


>
> Pat
>
>
>
>
> ------------------------------------------------------------
> IHMC                                     (850)434 8903 or (650)494 3973
> 40 South Alcaniz St.                     (850)202 4416   office
> Pensacola                                (850)202 4440   fax
> FL 32502                                 (850)291 0667   mobile
> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

-- 
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/

Received on Friday, 14 June 2013 16:26:31 UTC