Re: Comments on the new RDF Model Theory spec from patrick hayes on 2002-05-13 (www-rdf-comments@w3.org from April to June 2002)

From: patrick hayes <phayes@ai.uwf.edu>
Date: Mon, 13 May 2002 01:02:45 -0500
To: Massimo Marchiori <massimo@w3.org>
Cc: www-rdf-comments@w3.org
Message-Id: <a05111706b904f0e87c5b@[65.217.30.195]>
>Long flights help to have spare time, and WWW2002 was far away....
>so here's my comments on the new MT spec
>http://www.w3.org/TR/2002/WD-rdf-mt-20020429/
>
>I just read up until 3.2.2, but as time is always scarce, I'll
>just send what I have so far rather than waiting ("on the fly",
>almost literally....! ;)
>
>Comments are structured this way:
>First, the considered section is written between "****"'s.
>Then comments occur, where the commented part of the MT
>is enclosed using <quote>,
>and the nature of the comment follows:
>EDITORIAL means an editorial remark
>WRONG means there's something wrong
>ISSUE means there's an issue
>And next, the actual comment appears.
>
>These are mostly editorial comments gathered when reading the spec,
>so no high-level architectural comments, which will come later.
>
>Executive summary: the MT is very fine, it just needs some editorial cleaning
>and some minor technical fixes. As far as low-level tech issues, just
>one (containers).
>
>Thanx,
>-M

OK, thanks for all the comments. Some of them I disagree with, 
explanations below.

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>
>**** 0.2 Graph Syntax ****
>
><quote>
>There are well-formed graphs that cannot be described by these 
>notations, however.)
></quote>
>EDITORIAL:
>What does "well-formed" mean? That's a big problem throughout the 
>draft: too many times terms are
>used without proper definitions (or a term is used before being 
>defined, without reference).

Sigh. I guess I never know quite how far I have to go in order to 
explain terms that seem to me to simply be part of the language. If 
it bothers you, I could just omit the phrase "well-formed".

>
><quote>
>An RDF literal has three parts ( a bit, a character string, and a 
>language tag), but we will treat them simply as character strings, 
>since the other parts of the literal play no role in the model 
>theory.
></quote>
>EDITORIAL/WRONG:
>I hope this sentence is going to change, and it's just part of this 
>version of the draft, as of course there
>has to be a formal definition of what a literal is

Agree...

>(and, the MT is the place where it has to be!

....disagree. The proper place is the syntax document, which was not 
completed when this draft of the MT was written. This triplet 
structure of literals plays no role in the MT and has no relevance to 
it.

>). Saying
>the other parts "play no role" is confusing (and, formally, wrong), 
>so please in any case state it better.

I will try to say it better, but in fact it is not wrong as stated. 
The other parts of the literal have no effect on any truth-values of 
any triples.

>
><quote>
>Blank (unlabeled) nodes are considered to be drawn from some set of 
>'anonymous' entities which have no label and are unique to the graph
></quote>
>EDITORIAL:
>What's a "label"...?!

Formally, this is meaningless. Intuitively, it is intended to convey 
the intuition that blank nodes are indeed blank; they have no name.

>What does it formally mean to be "unique to the graph"...?

It is intended to convey the intuition that when an RDF document is 
parsed and a new RDF graph is created, the blank nodes in that graph 
are considered to be distinct from any blank nodes in any other 
graphs.

>
><quote>
>Finally, every arc in an RDF graph is labelled with a uriref.
></quote>
>EDITORIAL:
>What does this mean?

It is intended to give an intuitive picture of what the definitions 
mean. Formally, an RDF graph is simply a set of triples; but if that 
is all we say, then a reader might reasonably ask why we call it a 
'graph'.

>We're defining the RDF graph here: and, this is not really a "graph" 
>(so that, to solve the
>problems, we define it using triples). Here instead, you're 
>confusing the level of RDF-graph-definition, with
>the level of pictorial representation.

I disagree. The concept of a label is not a pictorial concept; it can 
be given a purely mathematical characterization.

>Please be clearer.

I will try, though the present prose is something like my fifth 
attempt to make this clear.

>
><quote>
>Two RDF documents, in whatever lexical form, are syntactically 
>equivalent if and only if they map to the same RDF graph.
></quote>
>EDITORIAL/WRONG: This is a definition that is never used later, so 
>you might consider dropping it.

It isn't a definition, but a comment. The MT itself does not refer to 
any RDF syntax other than the graph itself.

However, I had thought that it was in fact correct.

>But if you don't,
>please note that this is likely wrong as written here: this is due 
>to the fact that syntax -> graph is a relation
>and not a map.

Can you expand on this point? You are the first person to make this 
claim, and I would like to get this point clear.

>What you mean is probably to say they map to "equivalent" RDF graphs 
>(meaning, semantically equivalent).

No, I meant it in the strict syntactic sense.

><quote>
>An RDF graph can then be defined as a set of triples of the form <S, 
>P, O>, where P is a uriref, S is either a uriref or a blank node, 
>and O is either a uriref, a blank node, or a literal
></quote>
>EDITORIAL/WRONG:
>This is a multiset, if as said in 0.3, arcs are never merged. On the 
>other hand, I think you really meant set here,

Yes, I meant set. You are correct that the wording in 0.3 about 
merging arcs needs to be rewritten; thanks for catching that (it was 
left over from a previous incarnation).

>and as such, the clarification in 0.3 about arc merging should be 
>better stated.
>Moreover, you should add the finiteness condition: An RDF graph is a 
>*finite* set (or multiset..) of triples.. etc.

There is no finiteness condition, deliberately. Imposing it would 
unnecessarily complicate the definitions of entailment, and serve no 
useful purpose. Several of the closures defined later involve 
infinite sets of triples.

>
><quote>
>The convention that relates such a set of triples to a picture of an 
>RDF graph can then be stated as follows. Draw one oval for each 
>blank node and uriref, and one rectangle for each literal, which 
>occur in either the S or O position in any triple in the set, and 
>write each uriref or literal as the label of its shape. Then for 
>each triple <S,P,O>, draw an arrowed line from the shape produced 
>from S to the shape produced from O, and label it with P.
></quote>
>EDITORIAL:
>This is rather confusing...: "rectangles"? "ovals"?

Yes, the words are being used in their ordinary English sense. I see 
no need to explain them further. Perhaps it would help if I 
reproduced one of the diagrams from [RDFMS] ?

>Please be clearer, and add a disclaimer that this is not a
>"graph" as usually intended in the literature anyway...

It depends on which literature you read; but OK, I should have such a 
disclaimer, indeed. It was in an earlier draft. Will do.

>
><quote>
>In particular, two N-triples documents which differ only by 
>re-naming their node identifiers will be understood to describe 
>identical RDF graphs.
></quote>
>EDITORIAL/WRONG:
>This is formally wrong.

No, it is exactly and formally correct. This is an important point; 
the blank nodes in the graph really are blank. They are not 'hidden' 
names.

>Here you should not say "identical" RDF graphs, but rather, define 
>equality on graphs

Formally, a graph is a set of triples, and that formally defines 
equality unambiguously: same set, same graph.
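
A minimal sketch of this point in Python, assuming urirefs are modelled as plain strings and a blank node as an anonymous object (the names `BlankNode` and `graph` are illustrative, not from the MT):

```python
class BlankNode:
    """A nameless node; its only identity is object identity."""

def graph(*triples):
    # An RDF graph is modelled as a set of (S, P, O) triples.
    return frozenset(triples)

b = BlankNode()
g1 = graph(("ex:a", "ex:p", b), (b, "ex:q", "ex:c"))
# Same triples, listed in another order and with one repeated.
g2 = graph((b, "ex:q", "ex:c"), ("ex:a", "ex:p", b), ("ex:a", "ex:p", b))

same_graph = (g1 == g2)  # set equality: order and repetition are irrelevant
```

Because the graph is a set rather than a multiset, the repeated triple contributes nothing and the two values compare equal.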

>where
>blank nodes renaming

That phrase is meaningless. Blank nodes do not have names and cannot 
be re-named.

>can occur, and then use this new equality definition when needed.
>
><quote>
>Other RDF serializations may use other means of indicating the graph 
>structure; for our purposes, the important syntactic property of RDF 
>graphs is that each distinct item in an RDF graph is treated as a 
>distinct referring entity in the graph syntax.
></quote>
>EDITORIAL:
>What does this formally mean? First, "serialization" is not 
>defined. Second, the rest of the paragraph makes
>little sense.

Well, I disagree. I think it makes perfect sense. Can you articulate 
in what way it fails to convey the point?

>If we need to talk about serializations, let's give a definition and 
>state their properties.
>Likely, we don't need this here, but if you feel we do, then let's 
>do it right and clearly.

We cannot define every word we use. The concept of a serialization is 
surely familiar to most readers of the RDF documentation. The 
sentence is intended only to be helpful, not to be a formal 
definition.

>
>
>**** 0.3 Definitions ****
>
><quote>
>The result of taking the set-union of two or more RDF graphs (i.e. 
>sets of triples) is another graph, which we will call the merge of 
>the graphs
></quote>
>WRONG:
>This is formally wrong, and contradicts what is said after this 
>sentence (the fact that blank nodes are not merged).

No, it is formally correct and does not contradict it. Read the 
definitions carefully.

>Formally define the real merge operation (and, if that is the case, note just 
>the opposite of what is written here,
>i.e. the fact that the subset relationship can no longer hold when merging).

I do not follow you. Formally, an RDF graph is a set (of triples). 
The merge is the union set. How can a subset relation not hold 
between them?
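
As a small illustration in Python, assuming graphs are modelled as frozensets of triples over plain strings (the helper name `merge` is illustrative only):

```python
def merge(*graphs):
    # The merge is taken here to be the plain set-union of the triple sets.
    return frozenset().union(*graphs)

g1 = frozenset({("ex:a", "ex:p", "ex:b")})
g2 = frozenset({("ex:b", "ex:q", "ex:c"), ("ex:a", "ex:p", "ex:b")})
m = merge(g1, g2)

# Each operand is a subset of the union, by the definition of union.
g1_is_subgraph = g1 <= m
g2_is_subgraph = g2 <= m
```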

>
><quote>
>and that a graph is an instance of another just when every triple in 
>the first graph is an instance of a triple in the second graph, and 
>every triple in the second graph has an instance in the first graph.
></quote>
>WRONG:
>The substitution of blank nodes must be well-defined through the whole 
>graph (so, respecting at least node
>identities). The way you define it (triple by triple) is incorrect.

No, it is correct.  The same blank node may occur in several triples, 
and substitution is well-defined throughout the graph. Your objection 
would hold if a graph were an N-triples document.
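
A rough Python sketch of why this works when the graph is a set of triples sharing node objects, assuming blank nodes are modelled as anonymous objects (`BlankNode` and `instantiate` are hypothetical names, not from the MT):

```python
class BlankNode:
    """A nameless node; two blank nodes are equal only if they are the same object."""

def instantiate(graph, mapping):
    """Apply one blank-node mapping to every triple of the graph."""
    def sub(term):
        return mapping.get(term, term) if isinstance(term, BlankNode) else term
    return frozenset((sub(s), p, sub(o)) for (s, p, o) in graph)

y = BlankNode()
# The same node object y occurs in both triples.
g = frozenset({("ex:a", "ex:p", y), (y, "ex:q", "ex:c")})
inst = instantiate(g, {y: "ex:b"})
```

Since both triples contain the very same node, a single mapping necessarily replaces it consistently in both places.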

>**** 1.1 Technical Notes ****
>
><quote>
>This might seem to violate one of the axioms of standard 
>(Zermelo-Fraenkel) set theory, the axiom of foundation, which 
>forbids infinitely descending chains of membership
></quote>
>EDITORIAL:
>Think about some poor guy who's reading this and doesn't have a math degree

Then he should skip the technical sections. In any case, speaking as 
a poor guy WITH a math degree who has been forced to read the XML 
documentation and such horrors as RFC 2396, I don't have much 
sympathy.

>... This is all stated without references,
>so some literature pointers should be added (if you can't avoid 
>mention it, then at least give people hooks ;)

I really cannot be expected to provide an entire background course in 
formal set theory. If you give "Zermelo-Fraenkel set theory" to 
Google you should get something readable.

>
><quote>
>Interpretations which share the special meaning of a particular 
>reserved vocabulary will be named for that vocabulary, so that we 
>will speak of 'rdf-interpretations' and 'rdfs-interpretations', etc.
></quote>
>What does "share the special meaning" mean? This is colloquial but 
>sloppy, please be clearer.

The rest of the document tries to make this clearer.

>**** 1.4 Denotations of ground graphs ****
>
><quote>
>Notice that if the vocabulary of an RDF graph contains urirefs that 
>are not in the vocabulary of an interpretation I - that is, if I 
>simply does not give a semantic value to some name that is used in 
>the graph - then these truth-conditions will always yield the value 
>false for some triple in the graph, and hence for the graph itself.
></quote>
>EDITORIAL/WRONG:
>This is a bit dense, and should be better rewritten.

OK, I will try. However, it is not wrong.

>Also, is this actually part of the definition, or just a comment...?
>I.e., are you also here considering cases where IR is too small wrt 
>a graph? And if so, isn't it the case that
>the formal definition of the interpretation is then just 
>ill-defined, and therefore its definition should be appropriately
>modified to handle such cases?

No, it is defined. That is the point: it is defined for any IR, but 
if the vocabulary fails to refer then the truth-value is false.
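
A simplified Python sketch of these truth-conditions for ground triples over urirefs only, assuming an interpretation is given as a dict with a name mapping `IS` and a property-extension mapping `IEXT` (this dict layout and the function names are assumptions made for illustration; literals are left out):

```python
def triple_true(interp, triple):
    IS, IEXT = interp["IS"], interp["IEXT"]
    s, p, o = triple
    # A name outside the interpretation's vocabulary makes the triple false,
    # rather than leaving its truth-value undefined.
    if any(name not in IS for name in (s, p, o)):
        return False
    return (IS[s], IS[o]) in IEXT.get(IS[p], set())

def graph_true(interp, graph):
    # A ground graph is true just when every one of its triples is true.
    return all(triple_true(interp, t) for t in graph)

I = {"IS": {"ex:a": 1, "ex:p": 2, "ex:b": 3}, "IEXT": {2: {(1, 3)}}}
covered = {("ex:a", "ex:p", "ex:b")}
uncovered = covered | {("ex:a", "ex:p", "ex:zzz")}  # ex:zzz is not in the vocabulary of I

covered_value = graph_true(I, covered)
uncovered_value = graph_true(I, uncovered)
```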

><quote>
>Turned around, this means that any assertion of a graph implicitly 
>asserts that all the names in the graph actually denote something in 
>the world
></quote>
>EDITORIAL:
>Where is "assertion of a graph" defined?

The general idea of asserting a graph is discussed at the beginning 
of section 1.3 (second para).

><quote>
>Since the universe of this interpretation contains no character 
>strings as objects, any triple with a literal object would be false.
></quote>
>EDITORIAL/WRONG:
>Same comment as the first one for 1.4 above.

And same reply.

>**** 1.5 Unlabeled nodes as existential assertions ****
>
><quote>
>1.5. Unlabeled nodes as existential assertions
></quote>
>EDITORIAL:
>This is the first place where the wording "unlabeled node" occurs

It is introduced in section 0.2 as a synonym for 'blank'

>(and it is used all along
>after this), while before in the doc, there's always been "blank 
>nodes". Consistency would be better.

If you prefer. We have used both terms so much in other documents 
that it seemed best to use both of them here as well.

>
><quote>
>Notice also that the unlabeled nodes themselves are perfectly 
>well-defined entities with a robust notion of identity
></quote>
>EDITORIAL:
>"robust" notion of identity....?

Yes, robust. Clear, unambiguous, not open to doubt or shilly-shallying.

>
>
>**** 1.6 Comparison with formal logic ****
>
><quote>
>For example, the graph defined in the above example translates to 
>the logical expression (written in the extended KIF syntax defined 
>in [Hayes&Menzel])
>
>(exists (?y)(and (ex:a ?y ex:b)(ex:b ex:c ?y)))
></quote>
>
>EDITORIAL:
>Why use KIF syntax rather than well-known (and much more common) 
>first order logic formalisms?
>In any case, at least, some explanation of the syntax should be given.

If you need one, we give a reference. I would have thought it was 
pretty obvious. In any case, KIF is well-known and widely used, and 
KIF syntax *is* a first-order syntax. There is no single canonical 
first-order syntax: one can find prefix, postfix, infix, and 
graph-based syntaxes in the logical literature since Frege (1879).

>
><quote>
>The above example would then map to
>
>(exists (?y)(and (PropertyValue ex:a ?y ex:b)(PropertyValue ex:b ex:c ?y)))
></quote>
>EDITORIAL:
>Same comment as above.
>
>
>**** 2. Simple entailment between RDF graphs ****
>
><quote>
>Following conventional terminology, we say that I satisfies E if 
>I(E)=true, and that a set S of expressions (simply) entails E if 
>every interpretation which satisfies every member of S also 
>satisfies E.
></quote>
>EDITORIAL:
>Where is "expression" defined....?

Again, normal English usage. But it might be better to say 'graph', I guess.

>**** 2. Simple entailment between RDF graphs ****
>
><quote>
>The interpolation lemma completely characterizes simple RDF 
>entailment in syntactic terms.
><snip/>
>The existence of complete subgraph-checking algorithms also shows 
>that RDF is decidable, i.e. there is a terminating algorithm which 
>will determine for any finite set S and any graph E, whether or not 
>S entails E.
></quote>
>EDITORIAL:
>This argument to prove decidability (above lines chopped) is correct 
>but looks like quite an overkill for the reader.
>Simple model checking suffices to prove decidability here,

I guess I assumed that more readers would be familiar with the idea 
of subgraph algorithms than with things like model checking.

>once it is noted that finite-domain reasoning can be applied.

Why finite-domain reasoning?? IR isn't required to be finite.

>
><quote>
>If an RDF document is asserted, then it would be invalid to bind new 
>values to any of its unlabeled nodes, since (by the anonymity 
>lemmas) the resulting graph would not be entailed by the assertion.
></quote>
>EDITORIAL:
>What are the anonymity lemmas? They never appeared in the doc yet, 
>and there's no forward reference.

Whoops, sorry: left over from editorial revisions. I will correct this.

>
>
>
>**** 2.1 Criteria for non-entailment ****
>
>EDITORIAL:
>Do we really need this complex subsection in the main spec, and not 
>just as an appendix?
>It doesn't seem to give any mainstream contribution (it even has to 
>introduce yac (yet
>another concept), lean graphs, just to prove the lemmas, which are accessory).
>So, the added value in the normative main text is probably not worth 
>the complexity this adds
>to the reading.

Possibly. I could put all this into the appendix and just have a few 
examples and remarks in the main text.

>
><quote>
>We emphasise again that these results apply only to simple 
>entailment, not to the namespace entailment relationships defined in 
>rest of the document.
></quote>
>EDITORIAL: this tends to be very confusing. It'd be much better to 
>explicitly write "simple entailment" in all
>occasions where a statement doesn't hold for all other entailments 
>defined in the spec.

Right, that might be better. I was trying to keep the thing more 
readable, but maybe it would be better to be exact. I will correct 
this.

>
>**** 3.2.1 Reification ****
>
><quote>
>The intended interpretation of these are that a triple of the form
>
>aaa [rdf:type] [rdf:Statement] .
>
>is true in I just when I(aaa) is an RDF triple in some RDF document.
></quote>
>EDITORIAL/WRONG:
>What does this mean? (formally, nothing...).

Well, no, the point is that it does mean that certain entailments are 
false that would be true in the other interpretation. Since the WG 
spent a GREAT deal of time getting this sorted out, it is important 
that the decision be recorded, and the MT document seems like the 
place to record it. I agree it is rather a detour from the main MT 
development, but that reflects the fact that RDF reification is a 
crock of s**t;  which isn't our fault, but is a fact that we have to 
face up to rather than ignore.

>It'd be better rephrased or omitted.
>
><quote>
>Let us call the node which is intended to refer to the first triple 
>- the blank node in the second graph - the center of the 
>reification. (This can be a blank node or a uriref.)
></quote>
>EDITORIAL:
>This goes on with some formal confusion. It should just be explained 
>what a reification
>is, using a generic node (blank or uriref). As it is now, saying the 
>"blank node" is the
>center, and then saying it can be blank or uriref is formally confusing.

I agree, and will rewrite.

>
>
>**** 3.2.2 Containers ****
>
><quote>
>RDF does not support any entailments which could arise from 
>re-ordering the elements of an rdf:Bag.
><snip/>
>Notice that if this conclusion were valid, then the result of 
>conjoining it to the original graph would also be a valid 
>entailment, which would assert that both elements were in both 
>positions. (This is a consequence of the fact that RDF is a purely 
>assertional language.)
></quote>
>
>ISSUE:
>This amounts to dropping an important functionality that is part of the 
>normative RDFM&S spec.

Take this issue up with the WG, not with me.

In my view, this functionality never was in RDF: it is a fantasy 
based on a misunderstanding about the nature of assertional languages.

>This is not documented anywhere in the RDF Issue List, cf 
>http://www.w3.org/2000/03/rdf-tracking/ .
>So, why rule out entailments like the one cited in the spec, cf
><quote>
>_:xxx [rdf:type] [rdf:Bag] .
>_:xxx [rdf:_1] <ex:a> .
>_:xxx [rdf:_2] <ex:b> .
>
>does not entail
>
>_:xxx [rdf:_1] <ex:b> .
>_:xxx [rdf:_2] <ex:a> .
></quote>
>...??

The wording you cite tries to explain why. If

_:xxx [rdf:type] [rdf:Bag] .
_:xxx [rdf:_1] <ex:a> .
_:xxx [rdf:_2] <ex:b> .

entails

_:xxx [rdf:_1] <ex:b> .
_:xxx [rdf:_2] <ex:a> .

then it also must entail

_:xxx [rdf:type] [rdf:Bag] .
_:xxx [rdf:_1] <ex:a> .
_:xxx [rdf:_2] <ex:b> .
_:xxx [rdf:_1] <ex:b> .
_:xxx [rdf:_2] <ex:a> .

and by suitable reordering, it will entail that ALL members of the 
bag are in ALL positions.

All RDF containers have an inherent ordering on their members that is 
fixed by the very syntax of the membership properties. If you want 
RDF bags to be really bags, then you must re-design the container 
membership function suite. You can't have unordered containers 
accessed with ordered membership functions in an assertional 
language; the very act of saying they are in the container assigns 
them a place in the ordering, and there is no way to 'un-say' it.
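
The argument can be traced mechanically. A small Python sketch, assuming triples are modelled as tuples of strings and the blank node is written with its N-triples identifier `_:xxx` (the variable names are illustrative):

```python
original = {
    ("_:xxx", "rdf:type", "rdf:Bag"),
    ("_:xxx", "rdf:_1", "ex:a"),
    ("_:xxx", "rdf:_2", "ex:b"),
}
reordered = {
    ("_:xxx", "rdf:_1", "ex:b"),
    ("_:xxx", "rdf:_2", "ex:a"),
}

# If `original` entailed `reordered`, it would also entail their conjunction,
# which for sets of triples is simply the union.
conjoined = original | reordered

# Collect the members asserted at each position in the conjoined graph.
positions = {
    prop: sorted(o for (_, p, o) in conjoined if p == prop)
    for prop in ("rdf:_1", "rdf:_2")
}
```

In the conjoined graph both elements are asserted at both positions, which is the unwanted conclusion described above.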

>I think DanC brought this issue of mine to RDF Core some time ago, 
>but as said, I can't
>find anything in the issues list.

It's in the email record somewhere (don't have web access to find it 
right now, sorry).

>Mmm, let's use the new cool W3C search feature... gotcha, here it is 
>DanC's notice:
>http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Nov/0085.html
>I tried to follow the thread but can't find any resolution...
>Note also that similar reasoning applies to rdf:Alt .

Indeed. rdf:Alt has other problems, however.

Pat
Received on Monday, 13 May 2002 11:32:48 UTC