Comments on the new RDF Model Theory spec [non-editorial part] from Massimo Marchiori on 2002-05-22 (www-rdf-comments@w3.org from April to June 2002)

From: Massimo Marchiori <massimo@w3.org>
Date: Wed, 22 May 2002 12:38:36 -0400
To: phayes@ai.uwf.edu, www-rdf-comments@w3.org
Cc: massimo@w3.org
Message-Id: <200205221638.MAA16873@tux.w3.org>
**** NON-EDITORIAL PART
**** After
**** http://lists.w3.org/Archives/Public/www-rdf-comments/2002AprJun/0093.html


> ><quote>
> >An RDF literal has three parts ( a bit, a character string, and a 
> >language tag), but we will treat them simply as character strings, 
> >since the other parts of the literal play no role in the model 
> >theory.
> ></quote>
> >EDITORIAL/WRONG:
> >I hope this sentence is going to change, and it's just part of this 
> >version of the draft, as of course there
> >has to be a formal definition of what a literal is
> 
> Agree...
> 
> >(and, the MT is the place where it has to be!
> 
> ....disagree. The proper place is the syntax document, which was not 
> completed when this draft of the MT was written. This triplet 
> structure of literals plays no role in the MT and has no relevance to 
> it.
>
> >). Saying
> >the other parts "play no role" is confusing (and, formally, wrong), 
> >so please in any case state it better.
> 
> I will try to say it better, but in fact it is not wrong as stated. 
> The other parts of the literal have no effect on any truth-values of 
> any triples.

No, this has to be said clearly in the MT. 
[warning note: we might be saying the same thing here]
For example, how do you say whether two literals are the same? 
[incidental note: the fact that this doesn't change the truth values means little.
If you build an MT where every literal is mapped to "Rome", then 
you don't change any truth value, but... are you doing the right thing?
No, because you're cutting out intuitively plausible models, where
"Athens" and "Vienna" possibly denote different things. ]
The MT specifies the meaning of a syntactic piece of RDF. Mentioning just the 
string part means that RDF applications don't "see" any information
other than the string, which means that, de facto, all the other
info in the literal is virtually useless.
So, we might be saying the same thing here (you don't want to write
it, because you mean there's no semantics for it), and it's a fine
choice. But then no ambiguity or confusion should arise here,
and other literal components should not be first introduced and then swept
under the rug, because we risk the usual interpretation (and *interoperability*)
problems common to RDF "as it was".
So, ISSUE:
If there's no use for the other literal information in the semantics,
then take it out (what is it needed for?). If there's some use,
put it in the semantics (the graph). In any case, be razor clear.
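
To make the question "how do you say whether two literals are the same?" concrete, here is a minimal sketch. It assumes a literal is modelled as the three parts named in the quoted draft text (a bit, a character string, a language tag); the class and function names are hypothetical and not taken from any spec.

```python
from typing import NamedTuple

class Literal(NamedTuple):
    flag: bool   # the "bit" from the quoted draft; its meaning is not stated in this message
    text: str    # the character string
    lang: str    # the language tag

def same_as_string(a: Literal, b: Literal) -> bool:
    """Equality if literals are treated simply as character strings."""
    return a.text == b.text

def same_as_triple(a: Literal, b: Literal) -> bool:
    """Equality if all three parts of the literal count."""
    return a == b

english = Literal(False, "chat", "en")
french = Literal(False, "chat", "fr")

print(same_as_string(english, french))  # True
print(same_as_triple(english, french))  # False
```

Under the string-only reading the language tag is invisible to any application that follows the MT, which is the point of the ISSUE above.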



> ><quote>
> >Two RDF documents, in whatever lexical form, are syntactically 
> >equivalent if and only if they map to the same RDF graph.
> ></quote>
> >EDITORIAL/WRONG: This is a definition that is never used later, so 
> >you might consider to drop it.
> 
> It isn't a definition, but a comment. The MT itself does not refer to 
> any RDF syntax other than the graph itself.
> 
> However, I had thought that it was in fact correct.
> 
> >But if you don't,
> >please note that this is likely wrong as written here: this is due 
> >to the fact the syntax -> graph is a relationship
> >and not a map.
> 
> Can you expand on this point? You are the first person to make this 
> claim, and I would like to get this point clear
> 
> >What you mean is probably to say they map to "equivalent" RDF graphs 
> >(meaning, semantically equivalent).
> 
> No, I meant it in the strict syntactic sense.

This is related to some other points we'll see later, so I'll address
it below.

> >Moreover, you should add the finiteness condition: An RDF graph is a 
> >*finite* set (or multiset..) of triples.. etc.
> 
> There is no finiteness condition, deliberately. Imposing it would 
> unnecessarily complicate the definitions of entailment, and serve no 
> useful purpose. Several of the closures defined later involve 
> infinite sets of triples.

This is a minor point, but anyway: the fact that one uses infinitely many triples
in a proof doesn't imply that one has to allow infinitely many triples
in the definition of an RDF graph. The moment someone has to state the
computability properties of RDF graphs, you'll be forced to
always say "finite RDF graph" to get that RDF is computable (i.e. that
people can effectively do stuff with it). Isn't that strange?
The other way around, instead, you get that all RDF graphs, RDF entailment, etc.
are computable (as people expect...), at the price of slightly complicating
the proof (but who cares? After all, people use RDF graphs, not the proof...).




> ><quote>
> >In particular, two N-triples documents which differ only by 
> >re-naming their node identifiers will be understood to describe 
> >identical RDF graphs.
> ></quote>
> >EDITORIAL/WRONG:
> >This is formally wrong.
> 
> No, it is exactly and formally correct. This is an important point; 
> the blank nodes in the graph really are blank. They are not 'hidden' 
> names.
> 
> >Here you should not say "identical" RDF graphs, but rather, define 
> >equality on graphs
> 
> Formally, a graph is a set of triples, and that formally defines 
> equality unambiguously: same set, same graph.
> 
> >where
> >blank nodes renaming
> 
> That phrase is meaningless. Blank nodes do not have names and cannot 
> be re-named.
> 
> >can occur, and then use this new equality definition when needed.

This comes from the "no free meal" principle.
The only thing that matters here is the definition of RDF graph.
And as that's formally done using N-triples, you have been forced
to give away the node identity that you get for free from the graph, 
and instead use objects from a set:
<quote>
(unlabeled) nodes are considered to be drawn from some set of 'anonymous' entities
</quote>
The moment you do this, you are introducing "hidden names", and you are forced
to do the renaming (actually, "remapping" is maybe more appropriate) using
an injection (i.e., a mapping that keeps distinct nodes distinct).
You can see this was evidently a confusing point, as in the rest of the definition
the same sentence above continues with 
<quote>
which have no label and are unique to the graph
</quote>
which is covered in one of the editorial comments, as the two graphs (the N-triples
one, which is the formal one, and the pictorial one) are being confused with each other...

Back to the previous comment on "syntactical equivalence" then: formally, 
syntax -> graph is not a map but a relation, because for every "anonymous node"
(pictorial graph) you need to pick some representative element in the 
set of anonymous entities (N-triples level), and so in general you have infinitely 
many RDF graphs corresponding to one syntax.
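
A minimal sketch of this argument, assuming anonymous nodes are modelled as elements drawn from a set (here, tuples like ("anon", 1)) and a graph is a plain set of triples. The helper names to_graph and same_up_to_remapping are hypothetical, and the parser only handles the tiny whitespace-separated form used in this message.

```python
from itertools import permutations

def to_graph(doc, entities):
    """Turn a tiny N-Triples-like document into a set of triples, picking
    one representative from `entities` for each node identifier."""
    chosen = {}
    pool = iter(entities)
    graph = set()
    for line in doc.strip().splitlines():
        triple = []
        for term in line.rstrip(" .").split():
            if term.startswith("_:"):
                if term not in chosen:
                    chosen[term] = next(pool)
                term = chosen[term]
            triple.append(term)
        graph.add(tuple(triple))
    return graph

def same_up_to_remapping(g1, g2):
    """True if some bijection between the anonymous entities turns g1 into g2."""
    a1 = sorted({x for t in g1 for x in t if isinstance(x, tuple)})
    a2 = sorted({x for t in g2 for x in t if isinstance(x, tuple)})
    if len(a1) != len(a2):
        return False
    for perm in permutations(a2):
        m = dict(zip(a1, perm))
        if {tuple(m.get(x, x) for x in t) for t in g1} == g2:
            return True
    return False

doc = """
_:a <ex:p> _:b .
_:b <ex:q> <ex:c> .
"""
g1 = to_graph(doc, [("anon", 1), ("anon", 2)])
g2 = to_graph(doc, [("anon", 7), ("anon", 8)])

print(g1 == g2)                      # False
print(same_up_to_remapping(g1, g2))  # True
```

The same document yields different sets depending on which representatives are picked, so under this modelling set equality alone does not identify the two results; an equality "up to remapping" is needed.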


> ><quote>
> >The result of taking the set-union of two or more RDF graphs (i.e. 
> >sets of triples) is another graph, which we will call the merge of 
> >the graphs
> ></quote>
> >WRONG:
> >This is formally wrong, and contradicts what said after this 
> >sentence (the fact blank nodes are not merged).
> 
> No, it is formally correct and does not contradict it. Read the 
> definitions carefully.

This is in line with the above. No free meal: in the set union, you have to remap
the "anonymous entities" accordingly.

> >Formally define the real merge operation (and if the case, note just 
> >the opposite of what written here,
> >i.e. the fact the subset relationship can not hold any more when 
> >merging).
> 
> I do not follow you. Formally, an RDF graph is a set (of triples). 
> The merge is the union set. How can a subset relation not hold 
> between them?

Ditto. The merge is *not* the union set with the current definition
of RDF graph.
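
A small sketch of the difference, under the assumption argued for in this message: blank nodes are elements of a set of identifiers (strings starting with "_:"), and a graph is a set of triples. The function names are hypothetical and the fresh names "_:m0", "_:m1", ... are arbitrary.

```python
def blank_nodes(graph):
    """All blank-node identifiers occurring in a set of triples."""
    return {x for t in graph for x in t if x.startswith("_:")}

def merge(g1, g2):
    """Union of g1 and g2 after renaming g2's clashing blank nodes apart."""
    taken = blank_nodes(g1) | blank_nodes(g2)
    remap = {}
    n = 0
    for b in sorted(blank_nodes(g1) & blank_nodes(g2)):
        while f"_:m{n}" in taken:
            n += 1
        remap[b] = f"_:m{n}"
        taken.add(remap[b])
    renamed = {tuple(remap.get(x, x) for x in t) for t in g2}
    return g1 | renamed

g1 = {("_:n", "<ex:a>", "<ex:b>")}
g2 = {("_:n", "<ex:c>", "<ex:d>")}

union = g1 | g2
merged = merge(g1, g2)

print(len(blank_nodes(union)))   # 1: the plain union identifies the two nodes
print(len(blank_nodes(merged)))  # 2: the merge keeps them apart
print(g2 <= union)               # True
print(g2 <= merged)              # False
```

With this modelling the plain union silently identifies the two anonymous nodes, while the renaming merge keeps them apart at the cost of g2 no longer being a literal subset of the result.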

> ><quote>
> >and that a graph is an instance of another just when every triple in 
> >the first graph is an instance of a triple in the second graph, and 
> >every triple in the second graph has an instance in the first graph.
> ></quote>
> >WRONG:
> >The substitution of blank nodes must be well-defined thru the whole 
> >graph (so, respecting at least node
> >identities). The way you define it (triple by triple) is incorrect.
> 
> No, it is correct.  The same blank node may occur in several triples, 
> and substitution is well-defined throughout the graph. Your objection 
> would hold if a graph were an N-triples document.

Re-ditto. With the present definition,

_:xxx <ex:a> <ex:b>
_:xxx <ex:c> <ex:d>

and

_:yyy <ex:a> <ex:b>
_:zzz <ex:c> <ex:d>

are instances of each other (!)
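
The example can be checked mechanically. The sketch below assumes the triple-by-triple reading of the quoted definition, and compares it with a reading where one substitution is applied to the whole graph. Function names are hypothetical; terms are plain strings and blank nodes are those starting with "_:".

```python
from itertools import product

def is_blank(term):
    return term.startswith("_:")

def triple_is_instance(t, pattern):
    """True if t results from substituting the blank nodes of pattern."""
    binding = {}
    for got, want in zip(t, pattern):
        if is_blank(want):
            if binding.setdefault(want, got) != got:
                return False
        elif got != want:
            return False
    return True

def instance_per_triple(g1, g2):
    """The quoted definition read literally, one triple at a time."""
    return (all(any(triple_is_instance(t, p) for p in g2) for t in g1)
            and all(any(triple_is_instance(t, p) for t in g1) for p in g2))

def instance_graph_wide(g1, g2):
    """g1 is obtained from g2 by a single substitution over all of g2."""
    blanks = sorted({x for t in g2 for x in t if is_blank(x)})
    terms = sorted({x for t in g1 for x in t})
    for choice in product(terms, repeat=len(blanks)):
        m = dict(zip(blanks, choice))
        if {tuple(m.get(x, x) for x in t) for t in g2} == g1:
            return True
    return False

shared = {("_:xxx", "<ex:a>", "<ex:b>"), ("_:xxx", "<ex:c>", "<ex:d>")}
split = {("_:yyy", "<ex:a>", "<ex:b>"), ("_:zzz", "<ex:c>", "<ex:d>")}

print(instance_per_triple(shared, split))  # True
print(instance_per_triple(split, shared))  # True
print(instance_graph_wide(shared, split))  # True
print(instance_graph_wide(split, shared))  # False
```

Only the graph-wide reading distinguishes the two graphs: mapping both _:yyy and _:zzz to _:xxx gives the first graph, but no single value for _:xxx gives the second.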


> ><quote>
> >The intended interpretation of these are that a triple of the form
> >
> >aaa [rdf:type] [rdf:Statement] .
> >
> >is true in I just when I(aaa) is an RDF triple in some RDF document.
> ></quote>
> >EDITORIAL/WRONG:
> >What does this mean? (formally, nothing...).
> 
> Well, no, the point is that it does mean that certain entailments are 
> false that would be true in the other interpretation. Since the WG 
> spent a GREAT deal of time getting this sorted out, it is important 
> that the decision be recorded, and the MT document seems like the 
> place to record it. I agree it is rather a detour from the main MT 
> development, but that reflects the fact the RDF reification is a 
> crock of s**t;  which isn't our fault, but is a fact that we have to 
> face up to rather than ignore.
> 
> >It'd be better rephrased or omitted.

I see (and I admittedly had a good laugh at the crock part ;).
But the way it's written now it's confusing and, still, formally
wrong. Saying the above triple is "true in I", then adding
"when I(aaa) is an RDF triple" and going on with "in some RDF document"
really makes no sense (does it?).
Some explanation could just say, always formally, that reification is in fact
an operation that, given some triple T, produces a set of triples R(T) that 
doesn't entail T (at minimum, as there are other properties, but yes, of
no interest in the present MT context).
Then anybody can argue about its usefulness or not... ;), as anyway the rest
of 3.2.1 does a good job of explaining why this is of little use in the 
MT at the present time.
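
As a rough illustration of R(T), here is a sketch that builds the usual four reification triples from the RDF vocabulary (rdf:type rdf:Statement, rdf:subject, rdf:predicate, rdf:object). The function names and the "ex:" identifiers are hypothetical, and the entailment check is a deliberate simplification: it treats entailment between ground graphs (no blank nodes) as the subset relation.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def reify(triple, node):
    """R(T): describe the triple T as a statement named by `node`."""
    s, p, o = triple
    return {
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", s),
        (node, RDF + "predicate", p),
        (node, RDF + "object", o),
    }

def ground_entails(g, e):
    """Simplified check, assumed valid only when g and e have no blank nodes."""
    return e <= g

t = ("ex:a", "ex:p", "ex:b")
r = reify(t, "ex:stmt1")

print(len(r))                  # 4
print(ground_entails(r, {t}))  # False
```

So R(T) talks about T without asserting it, which is the property singled out above.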


Thanks,
-M
Received on Wednesday, 22 May 2002 12:38:41 UTC