Re: model theory for RDF/S from Pat Hayes on 2001-09-26 (www-rdf-logic@w3.org from September 2001)

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Wed, 26 Sep 2001 16:23:50 -0500
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Cc: www-rdf-logic@w3.org
Message-Id: <p0510102db7d7e3d28edf@[205.160.76.184]>

>Good job. However, I see several problems in the document.  Some of these
>I view as significant.

I agree, and thanks for the detailed work.

>
>1/ The document does not sufficiently pin down the relationship between
>    literal values and resources. 
>
>    The model theory does not require that literal values and resources be
>    disjoint.

Right. It is deliberately agnostic on this issue.

>Further, an RDF graph, as defined in Section 0.2, allows the
>    head of edges to be labeled with literals.

Ah, that is indeed a slip. The N-triples notation does not permit 
this, and we should have ruled it out from the RDF graph description 
also.

>  This means that there is no
>    significant difference between resources and literal values, nor between
>    URIs and literals.
>
>    The document, as it will probably be read, strongly suggests that
>    literal values and resources are disjoint.  In Section 1.3, IEXT is a
>    map into IR x (IR u LV), which will probably be read as requiring that
>    LV and IR be disjoint.

Well, section 1.1 has an explicit warning about not making that 
reading, which is definitely not intended. I guess I could say it 
again in section 1.3.

>    I think that the document needs to come down firmly on one side or the
>    other.

I disagree. The model theory does not depend on the resolution  of 
this debate, and will work unchanged in either case, so it is best 
left open.

I concede that the wording of the text could emphasize this point a 
little more clearly.

>Either it has to state that it is extending RDF to allow
>    literals to be the subject of predicates, or it has to exclude the
>    possibility of literals being the subject of predicates.

It is intended to conform to current RDF, where literals are not 
allowed to be subjects, but purely as a syntactic restriction. The 
model theory would apply to the more general case just as well. I 
think it is important to be clear when a restriction is not imposed 
by any semantic interpretation.
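
A minimal sketch of what "purely a syntactic restriction" could look like in practice: the check below rejects a triple before any interpretation is consulted. The `Literal` class and the `is_well_formed` helper are hypothetical names introduced only for illustration; they are not part of the model theory document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical wrapper marking a term as a literal rather than a URI."""
    value: str


def is_well_formed(triple):
    """Syntactic check only: a literal may not label the subject or predicate.

    No interpretation is involved, so this says nothing about whether
    literal values and resources are disjoint.
    """
    subject, predicate, _obj = triple
    return not isinstance(subject, Literal) and not isinstance(predicate, Literal)
```

A triple that fails this check is simply outside the syntax; the semantic rules would still be able to evaluate it if the restriction were lifted.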

>
>2/ The model theory does not assign meaning to RDF graphs.  Instead it
>    assigns meaning to RDF graphs with an extra, nowhere-defined attribute
>    on edges.  If this is supposed to be a model theory, then it should be
>    rigorous, and not have any undefined junk floating around.

I do not follow this point. What attribute on edges?

>
>3/ The model theory maps literals into literal values and URIs into
>    resources when it should be mapping nodes.

Yes, you are right. To be absolutely rigorous, many references to 
literals (not literal values) and URIs should speak of the node 
labelled with the literal or URI. However, nodes are 1:1 with their 
labels in any tidy graph, so I used a slight abuse of terminology to 
keep the descriptions shorter. I should put a note in the text 
somewhere acknowledging this.

>4/ The model theory breaks down for edges whose edge label does not map
>    into a property as IEXT is only defined on properties.

It doesn't break down; it assigns false to any such triple, by the 
third rule in 1.3.
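
A minimal sketch of how such a rule can behave, assuming a toy interpretation in which IEXT is stored only for resources that are properties. The sets `IR`, `IP`, the maps `IEXT` and `IS`, and the `ex:` URIs are all made up for illustration; the exact wording of the rule in section 1.3 is not reproduced here.

```python
# Hypothetical toy interpretation; every name and value is an assumption.
IR = {1, 2}                               # universe of resources
IP = {1}                                  # the resources that are properties
IEXT = {1: {(1, 2), (2, 1)}}              # extension, defined only on IP
IS = {"ex:a": 1, "ex:b": 1, "ex:c": 2}    # URI -> resource


def triple_is_true(s, p, o):
    """Truth value of a ground triple whose three terms are all URIs."""
    prop = IS[p]
    if prop not in IP:
        # The predicate does not denote a property, so IEXT is never
        # consulted and the triple comes out false rather than undefined.
        return False
    return (IS[s], IS[o]) in IEXT[prop]
```

Here `ex:c` denotes a resource outside IP, so any triple using it as a predicate is false. The toy `IS` also happens to map two URIs (`ex:a` and `ex:b`) to the same resource, which the evaluation handles without special treatment.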

>5/ The model theory allows multiple URIs to map to the same resource.  I
>    applaud this feature of the model theory, but it should be noted in the
>    document outside of examples.

I wasn't aware that this was a feature. If it is worth remarking on, 
we could remark on it.

>6/ The definitions of IEXT and IS should use the same notation.

As a matter of good style? OK, though you are the only one who has 
commented on this.

>7/ The mapping of unlabeled nodes is to the ``domain'' of interpretations,
>    but interpretations don't have domains.

IR is defined to be the "domain or universe of the interpretation" 
(section 1.3).  This usage is common in model theory, but to avoid 
confusion I will change the terminology in the next version.

>Later on the mapping for
>    unlabeled nodes is restricted to map to IR, which makes sense.
>
>8/ In the model theory for RDFS, there is the requirement that all literals
>    have rdf:type of LITERAL.

I hope not. I was careful to not include that case. I think you will 
find that the RDFS rule table makes no mention of literals other than 
to exclude them from some cases.....

>This now requires that literals be allowed in
>    the subject position of predicates,

...for just that reason.

>which is forbidden in M&S

Right. Which makes all uses of (... rdf:type rdfs:Literal .) either 
false in RDFS or ill-formed in RDF, which is why I decided to ignore 
them. I should have stated this explicitly in the text.

>and in the
>    document as probably being read, but not forbidden in the model theory
>    itself, although it has the effect of making all literals be resources.

That is not intended.

>
>9/ The model theory for RDFS is missing the requirement that the vocabulary
>    contain all the RDFS ``pre-defined'' URIs.

Right. That seemed to not be a model-theoretic matter, to me. The MT 
assigns meanings to the triples it finds, but does not impose any 
requirements on what triples must be present. But on reflection, it 
might be more coherent, and ultimately simpler, to insist that 
they be included.

>
>10/ The model theory for RDFS is confusing with respect to the status of
>     ICEXT.   ICEXT is not included in an interpretation, but appears
>     prominently in the interpretation conditions.  Now it is not strictly
>     necessary to have ICEXT be part of an RDFS interpretation, as implied
>     in the document, but if so, then it would be much better to write the
>     conditions without ICEXT (and also remove the condition that becomes
>     vacuous).

I agree we could eliminate ICEXT altogether. We debated this, and 
some members of the WG argued for it to be eliminated, and the entire 
model theory for RDFS stated in terms of the relational extension of 
rdf:type. I feel that having classes explicitly named makes the model 
theory easier to understand and follow intuitively, however, and the 
text draws attention to the fact that the semantic interpretation 
rule for rdf:type amounts to a definition of ICEXT in terms of IEXT, 
and that the whole of RDFS can be seen as an RDF theory of rdf:type.
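
The reading of ICEXT as a definition in terms of IEXT can be sketched as follows. This is an illustrative toy, not the document's formulation: the resource names (`"fido"`, `"Dog"`, `"Class"`) and the key `RDF_TYPE`, standing for the resource I(rdf:type), are assumptions.

```python
# Hypothetical toy data; resource names are illustrative assumptions.
RDF_TYPE = "type"   # stands for the resource denoted by rdf:type
IEXT = {
    RDF_TYPE: {("fido", "Dog"), ("rex", "Dog"), ("Dog", "Class")},
}


def icext(cls):
    """Class extension of cls, read off the extension of rdf:type.

    x is in ICEXT(cls) exactly when the pair (x, cls) is in IEXT(I(rdf:type)).
    """
    return {x for (x, c) in IEXT[RDF_TYPE] if c == cls}
```

Under this reading ICEXT carries no independent information: it is a convenient name for one slice of the rdf:type extension.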

>
>11/ The RDFS conditions are missing the fact that many ``pre-defined'' URIs
>     belong to IC.

See comment on 9/

>12/ The RDFS conditions are missing the range restriction on rdf:type.
>     Without this restriction, ICEXT is not simply a convenience nor is
>     rdfs:Resource an rdfs:Class in the model theory.

Whoops. This?

rdf:type rdfs:range rdfs:Class .

Indeed, there seems to be no way to generate that, which should of 
course be in every closure. Damn, you are right. And of course also:

rdfs:subClassOf rdfs:range rdfs:Class .
rdfs:subClassOf rdfs:domain rdfs:Class .
rdfs:range rdfs:domain rdf:Property .
rdfs:range rdfs:range rdfs:Class .
rdfs:domain rdfs:domain rdf:Property .
rdfs:domain rdfs:range rdfs:Class .

This would then simplify the table, since rule 9 would follow from 5 
and 6. (Some of these may be derivable from others; I haven't checked 
in detail. Don't think so, though.)

However, I think that is all that we need. Do you see any others 
(bearing in mind that we are ignoring the class rdfs:Literal as 
unusable) ?
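
To illustrate why the range triple on rdf:type matters for point 12, here is a minimal forward-chaining sketch of a single range rule, written with the `rdfs:range` spelling from the RDFS vocabulary. The function name `apply_range_rule` and the `ex:` triples are hypothetical, and this is not the closure table from the document, only one rule of that general shape.

```python
def apply_range_rule(triples):
    """Forward-chain one rule: (p rdfs:range C), (s p o)  =>  (o rdf:type C)."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        # Collect every (property, class) pair asserted with rdfs:range.
        ranges = {(s, o) for (s, p, o) in closure if p == "rdfs:range"}
        for prop, cls in ranges:
            # Iterate over a snapshot so the set can grow safely.
            for (s, p, o) in list(closure):
                derived = (o, "rdf:type", cls)
                if p == prop and derived not in closure:
                    closure.add(derived)
                    changed = True
    return closure


graph = {
    ("rdf:type", "rdfs:range", "rdfs:Class"),   # the triple noted as missing
    ("ex:fido", "rdf:type", "ex:Dog"),          # hypothetical instance data
}
closure = apply_range_rule(graph)
```

With the range triple present, the sketch derives that `ex:Dog` is an `rdfs:Class`, and then that `rdfs:Class` is itself an `rdfs:Class`. Without that triple this rule derives nothing, so nothing forces the object of an rdf:type triple to be a class.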

>13/ Many domain and range properties are missing from the RDFS
>     interpretation conditions.

Have I got them all above?

>14/ The RDFS schema-closure rule 1a has the effect of making all literal
>     values be resources as all literal values have an rdf:type property
>     because they are in the class extension of rdfs:Literal.  This is valid
>     in the RDFS model theory, as all literals must be resources there

That is not my understanding.

>, but
>     is probably not expected.

And it shouldn't be assumed to follow.  See comments on 8/

>
>15/ The RDFS schema-closure rule 1c also has the effect of making most
>     literal values be resources.

I fail to see why or how, since that rule explicitly excludes literals.

>  This is a valid rule, because all
>     literals are already resources, but is probably not expected.
>
>16a/ Because of the missing domain and range properties the RDFS
>     schema-closure rules 9a and 9b are not valid.  The RDFS schema-closure
>     rules are only valid because only properties have IEXT mappings.
>
>16b/ Because of the missing range property on rdf:type, RDFS schema-closure
>     rule 7 is invalid.
>
>16c/ Because of the missing range property on rdf:type, all resources are
>     subclasses of rdfs:Resource.
>
>16.../  There are probably other consequences of the missing domain and
>     range properties.

OK, OK. I will fix this asap. See above.

>17/ Because of the complexity of RDFS, I won't believe the Schema Lemma
>     until I see a completely worked out proof.

Fair enough. I no longer believe it myself. What I am sure of is that 
there is *some* closure table for which it is correct, however. Also, 
it should be stated so as to explicitly rule out the rdfs:Literal 
class.

>There is a typo in Figure 1.  It should say IEXT(1) instead of IEXT(I).
>
>There is a typo in the RDFS conditions.  They should say
>IEXT(I(rdfs:subPropertyOf)) instead of IEXT(rdfs:subPropertyOf).

ok, thanks.

Pat

-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Wednesday, 26 September 2001 17:23:49 UTC