review of July 15 draft of RDF Semantics document

As I had received information from the RDF Core WG that the RDF Semantics
document was suitable for review, and I needed to see if my many concerns
with the RDF model theory had been resolved, I did a pass through the July
15 draft of RDF Semantics.

Unfortunately, I found quite a number of problems with this draft.  Some of
these are problems remaining from previous versions of the document but
some of them appear to have been newly introduced.


Drastic Problem:

The treatment of XML Literals is inconsistent within the document and with
respect to RDF Concepts (at least the version of RDF Concepts that is
accessible through the pointer in the RDF Semantics document; there are
also broken links related to XML Literals).  The change list in RDF
Semantics says that XML literals ``are now required to be in canonical form
and therefore to denote their own literal string.''  This appears to mean
that XML literals are just a subset of character strings.  This is
completely counter to what is said in RDF Concepts.  Section 3 of RDF
Semantics has no mention of the fact that XML literals denote themselves.
It also says that it ``is deliberately agnostic as to whether or not XML
data is considered to be identical to a character string'', which is in
direct contradiction to the wording in the change list.

XML Literals have been a source of very many problems.  As they are still
not correct, it would be much better to just dump them entirely.


Drastic Problem:

There has been a significant conceptual change to simple interpretations.
IP is not required to be a subset of IR.  This does not appear to be in
response to any comment to the RDF Core Working Group nor to be in response
to any problem with the RDF model theory.  This change may have
consequences for other formalisms, including OWL, but no announcement about
it has been made.


Problem:

The definition of a proper instance admits a switch of blank nodes in the
graph, e.g., replacing _:a with _:b and vice versa, as a proper instance,
but this shouldn't be a proper instance.  

This invalidates the anonymity lemma, as 
	_:a <ex:p> _:b . 
is lean and a proper instance of itself, so by the lemma it should not
entail itself, yet every graph entails itself.
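The blank node switch can be sketched in a few lines of Python.  This is
only an illustration: triples are modelled as tuples of strings, and the
helper names (apply_mapping, equivalent_up_to_renaming) are hypothetical.
It does not encode the draft's definition of a proper instance; it only
shows that swapping _:a and _:b yields a graph that differs from the
original solely in the names of its blank nodes.

```python
from itertools import permutations


def is_blank(term):
    # Assumption for this sketch: blank nodes are strings starting with "_:".
    return term.startswith("_:")


def apply_mapping(graph, mapping):
    # Replace terms simultaneously according to `mapping`; unmapped terms stay.
    return frozenset(tuple(mapping.get(t, t) for t in triple) for triple in graph)


def blank_nodes(graph):
    return sorted({t for triple in graph for t in triple if is_blank(t)})


def equivalent_up_to_renaming(g1, g2):
    # Brute-force search for a one-to-one renaming of blank nodes taking g1 to g2.
    b1, b2 = blank_nodes(g1), blank_nodes(g2)
    if len(b1) != len(b2):
        return False
    return any(
        apply_mapping(g1, dict(zip(b1, perm))) == g2 for perm in permutations(b2)
    )


graph = frozenset({("_:a", "<ex:p>", "_:b")})
swapped = apply_mapping(graph, {"_:a": "_:b", "_:b": "_:a"})

print(swapped != graph)                           # the triples differ textually
print(equivalent_up_to_renaming(graph, swapped))  # but only by blank node names
```

If such a switch counts as a proper instance, then the lean graph above has
a proper instance that it trivially entails.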


Problem: 

The example of a lean graph is not lean, as the instance of this graph
obtained by replacing _:x with <ex:a> is a proper instance of the graph.
This calls into question the entire notion of lean graphs.


Problem:

The definition of the merge of a set of graphs is inadequate.  Just which
blank nodes of members of S are to be replaced?  From the definition, the
merge of 
	_:a <ex:p> _:b .
and 
	_:a <ex:p> _:c .
and
	_:b <ex:p> _:c .
could be
	_:a <ex:p> _:b .
	_:a <ex:p> _:c .
	_:e <ex:p> _:e .
as this ``replaces blank nodes in some members of S by distinct blank
nodes''.   There are other problems in the definition of the merge as well.
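For contrast, here is a minimal Python sketch of what I take to be the
intended reading of the merge: every blank node of every member of S is
renamed, one-to-one within its graph, so that no two graphs share a blank
node.  The function name merge and the renaming scheme are assumptions made
for illustration and are not taken from the draft.

```python
def is_blank(term):
    # Assumption for this sketch: blank nodes are strings starting with "_:".
    return term.startswith("_:")


def merge(graphs):
    # Standardize blank nodes apart: prefix each one with its graph's index.
    # The renaming is one-to-one within a graph and disjoint across graphs.
    merged = set()
    for index, graph in enumerate(graphs):
        for triple in graph:
            merged.add(
                tuple(
                    f"_:g{index}_{term[2:]}" if is_blank(term) else term
                    for term in triple
                )
            )
    return merged


graphs = [
    {("_:a", "<ex:p>", "_:b")},
    {("_:a", "<ex:p>", "_:c")},
    {("_:b", "<ex:p>", "_:c")},
]
merged = merge(graphs)
blanks = {t for triple in merged for t in triple if is_blank(t)}

print(len(merged), len(blanks))
print(any(s == o for s, _, o in merged))
```

Under this reading the result has three triples over six distinct blank
nodes, and a triple such as _:e <ex:p> _:e cannot arise, because _:b and
_:c in the third graph are never mapped to the same node.  The wording in
the draft does not rule that out.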


Problem:

In Section 1.3 a vocabulary is defined as a ``set of URIrefs''.  
However, in the change log and in Section 0.3, a vocabulary is supposed to
be able to contain typed literals.


Problem:

There is no definition of a ``literal character string'' or a ``language
tag'', used in the definition of simple interpretations.


Problem:

It is not the case that ``any URIref which occurs both as a predicate and
as a subject in any triple must denote something in the intersection of IR
and IP.''


Problem:

The conditions for denotations should be augmented with more conditions
like ``if I(p) is in IP''.    I suggest adding as well ``if s, p, and o are
in V''.


Problem:

The example in Section 1.4 is incomplete in that it does not define LV.
Also, IL is necessarily the empty map as there are no typed literals in the
vocabulary of the example.  This makes the fourth triple false, not true.

The ``oddity'' of having a typed literal denote a non-literal is not ruled
out in datatyped interpretations.

The explanation of why triples involving plain literals are false is
incomplete, as plain literals do not have to denote character strings.


Silliness:

rdf-interpretations do not just ``impose extra semantic conditions on crdfV
and typed literals with the type rdf:XMLLiteral''.  Why not just say that
rdf-interpretations impose extra semantic conditions?


Problem:

The vocabulary of an interpretation contains no ``well-typed XML
literal string''s, so the definition of rdf-interpretations is
suspect, at best.  Also, there is no definition for ``well-typed XML
literal''. 


Problem:

The document states several times that it is agnostic as to whether XML
literals are strings.  However, the claimed completeness of the RDF entailment
rules means that XML literals are not strings.


Problem:

The treatment of quoted strings in LBase is so bad that I can't even begin
to figure it out.  However, it is definitely the case that the translation
to LBase changes the denotation of character strings.  Whether this causes
problems I cannot determine.


Problem:

The translation to LBase seems to assume in some places that LBase uses
URIrefs of some sort, e.g., the expansion of Lbase:String.  However, the
LBase document itself uses non-URIref names for these things, e.g., String.

Problem:

The translation to LBase ignores some of the aspects of URI references, I
believe.  In particular, I believe that RDF URI references can include
whitespace, which is not allowed in LBase names.  I note also that LBase
doesn't even bother to define character strings.

Problem:

The translation to LBase can be broken by use of suitable URI references in
the RDF graph.  For example the translation of

	ex:a rdf:type LBase:String .

would imply the translation of

	ex:a rdf:type rdfs:Literal .

which is not a valid rdfs-entailment.

	
Problem:

The translation to LBase does not require the correct treatment of XML
literals.  XML literals are only handled in LBase translations of
D-interpretations. 




Question:

Does 
	<ex:a> <ex:b> "a"^^xsd:string .
xsd-entail
	<ex:a> <ex:b> "a" .
or not?

An answer to this question is needed for the RDF semantics to be
complete.  It should also be a test case.


Typos:

Section 1.3	etc..
Section 2	a set S of [graphs] (simply) entails a graph E



Peter F. Patel-Schneider
Bell Labs Research
Lucent Technologies

Received on Wednesday, 23 July 2003 13:24:40 UTC