heads-up on planned changes in MT

Apart from fixing things like missing references, unclear 
explanations, typos, etc., there are some fairly big changes I 
propose to make in the next version of the MT. I'd like to get this 
out as quickly as I can so that people don't start making commitments 
based on the things that are going to change. This is a summary of 
those changes, to help the WG follow what is going on.

Most of these have arisen from discussions with Peter 
Patel-Schneider, who deserves a lot of credit (or blame, if you 
prefer :-). I've taken the liberty of CCing this message to Peter. 
(Peter, please reply to me rather than the entire WG, thanks.)

----------------------------

1. Definition of RDF graphs.

The current text refers to 'a partially labelled graph', which is 
potentially misleading, since in many mathematical contexts a 'graph' 
is defined in such a way that there can be at most one edge between 
any pair of nodes. Peter has suggested a very strict mathematical 
style be adopted, but I propose to clarify the English wording in the 
main text, and include a painstakingly precise 'mathematical' version 
in the appendix.

-----------------------------

2. Dividing the vocabularies between RDF and RDFS . (Big one, this. 
It kind of grew as I was typing it.)

Currently, RDF interpretations are defined on an arbitrary 
vocabulary; but Peter has pointed out that the M&S requires that any 
RDF model include rdf:type as an rdf: Property, so this should be 
included in the RDF (not the RDFS) model theory.  Thus, V is required 
to contain at least 'rdf:type', and the semantic equations include 
the requirement that I(rdf:type) is in IP.

On thinking about this, I am inclined to go further, in fact, and 
require that *all* the rdf:-prefixed vocabulary is included in the 
vocabulary of any rdf interpretation, ie all of
rdf:type  rdf:Property, rdf:domain and rdf:range,
together with the semantic restrictions on them, in effect pulling 
back some of the rdfs semantics back into rdf.  In general, every 
interpretation is allowed to assign special meanings to some 
namespace, and so entailment is always with respect to that reserved 
namespace. (eg rdf:, rdfs:, daml:, etc.) The relation called 'rdf 
entailment' in the current MT will be re-christened 'basic 
entailment', and it corresponds to an empty namespace.

What this suggests is doing the MT exposition in three 'stages' 
instead of two. Right now we have a kind of pure-rdf-logic notion of 
entailment, then all the special vocabulary is added in one lump in 
section 4.  I propose that we leave the pure-rdf-logic part as it is 
- this is the semantics of the empty namespace, common to them all - 
then add the purely RDF vocabulary and analyse what happens when it 
is added (not much, see below); then add the rest of the RDFS 
vocabulary, and analyse what happens then.

This would mean that rdf-entailment (as it will be called, ie 
entailment with respect to the rdf: namespace) will become a bit more 
like rdfs entailment is now; but only a bit like it. (Jos, sorry to 
make your life more complicated, but I think it will actually work 
out better for the entailment examples, in the end. )

The following are true in any rdf interpretation of the rdf 
vocabulary (three typings and two domainings:)

rdf:domain rdf:type rdf:Property .

rdf:range rdf:type rdf:Property .

rdf:type rdf:type rdf:Property .

rdf:domain rdf:domain rdf:Property .

rdf:range: rdf:domain rdf:Property .

We can't say anything about the ranges of rdf:domain and rdf:range, 
or about the domain or range of rdf:type, or the type of 
rdf:Property, since we don't (yet) have the notion of a class; they 
are just resources at this stage, and we don't have any way to even 
say that.

The closure rules would be 1b ,5 and 6 from the current rdfs table

xxx aaa yyy .   --->    aaa rdf:type rdf:Property .

xxx aaa yyy .   &   aaa rdf:domain zzz .    --->   xxx rdf:type zzz .

xxx aaa uuu .   &    aaa rdf:range zzz .    --->   uuu rdf:type zzz .

Until we get into RDFS, there are no other semantic constraints on 
IEXT(I(rdf:type)) and so assertions made with rdf:type will have no 
other special entailments.

What these rules will do is quite simple. They take every occurrence 
of a property in any triple and 'decorate' it with the observation 
that it is an rdf:Property, and if anything is said about its domain 
and range, then they will say the appropriate thing about the subject 
and object, all using rdf:type; but that is all. They will increase 
the size of a graph with N triples by at most (N + 5) triples, all 
but two of the form
foo rdf:type baz .

This makes the contrast with RDFS more vivid, I think, and 
illustrates more forcibly the extra power that one gets from the 
notion of subclass. (Also, by the way, I think that the 'constant' 
triples for the rdfs closure, ie the ones that must be in any 
closure, can now be got by applying the rdf closure rules to the rdfs 
reserved vocabulary, which is very cute.)

Similarly, the vocabulary of any RDFS interpretation will be required 
explicitly to contain all of
rdfs:subClassOf, rdfs:subPropertyOf, rdfs:Class, rdfs:Resource and 
rdfs:Literal, and they will be considered to be 'rdfs-reserved' URIs. 
This is not made explicit currently.

Overall, this will have more effect on the appearance and 
presentation of the MT document than on the actual content, in fact; 
but I think it is important, as it will make the relationship between 
RDF and RDFS both clearer and more, as it were, intimate. It also 
gives the IP part of the rdf interpretations something to do in RDF. 
This idea of having a reserved namespace which has a fixed semantics, 
and entailment in general being relative to such a reserved 
vocabulary, is a generally useful idea in any case, and it would be 
good to introduce it into the MT document explicitly. It gives a 
quite vivid picture of 'basic' rdf inference and semantics as a kind 
of base or core on which everything else is built, as well, which is 
very SW-PC, right?

The contents will read something like this:
0. as now,
1. as now, with slight terminology change from 'rdf interpretation' 
to simply 'interpretation', emphasizing these are interpretations of 
the rdf graph syntax but make no assumptions about vocabulary..
2. as now
3. as now, with similar terminological care. (Distinguish 'simple 
entailment' from 'rdf entailment' and 'rdfs entailment')
4. Interpretations of RDF vocabulary (introduces notion of 'reserved' 
vocabulary, ie with special meaning assigned by the semantics, rdf: 
prefix, and simple semantic conditions.) (It might be fun to take the 
example in figure 1 and show how it gets expanded...I'll see if that 
is feasible.)
5. RDF entailment (like current section 5, but relates rdf entailment 
to simple entailment using  rules as above, with numerical analysis 
sketched out.)
6. Interpretations of RDFS vocabulary (extends 4. to full rdfs, with 
extra semantic equations, introduces ICEXT.) (May add some extra 
commentary on how this can all be said in terms of rdf:type, but only 
by using implication, which motivates the rules coming up:)
7. RDFS entailment (like current section 6, but obviously corrected. 
text will point out how the new rules interact with the limited rdf 
rules in section 5 and extend their power, with some simple examples.)
8. Containers (very brief, as now; but stated in terms of another 
vocabulary extension; the point here being that all the needed 
semantic restrictions can be stated axiomatically in RDFS.)

----------------------------------------------
3. Change to rdfs:Literal

The current MT has the semantic rule:

ICEXT(I(rdfs:Literal)) = LV

I propose to weaken this to

ICEXT(I(rdfs:Literal)) is a subset of LV

Contrary to appearances, this brings this condition more into line 
with the corresponding condition for rdfs:Resource:

ICEXT(I(rdfs:Resource)) = IR

since IR is only required to be *a* set of resources (not the set of 
*all* resources.)

The technical motivation for this arises from Peter's observation 
that the truth-condition for rdf:type entails that IR  must contain 
every member of the class extension of every named rdfs class, so the 
mere presence of rdfs:Literal in the vocabulary as an RDFS class 
requires that every member of ICEXT(I(rdfs:Literal)) is in IR; so if 
applied with the current MT condition on rdf:Literal, it has the 
unfortunate consequences that all literal values must be in the IR 
universe of every interpretation (so all literal values must be 
resources), and that all rdfs closures would be infinite.

This change has  rdfs:Literal  being treated in a special way; but in 
fact this makes sense, since the set LV  of literal values has a 
special status in that it is defined 'outside' of the interpretation, 
so it seems unreasonable to require all interpretations to contain 
all of it. It retains the agnosticism about the relationship between 
IR and LV  that the model theory was intended to have.

--------------------------------------------
4. rdfs closure rules

As previously noted, several important cases involving domain and 
range were omitted from the list of 'required' triples in rdfs. (Some 
of these are now in rdf).  When added, these make some of the table 
entries redundant. I will have a detailed correction/revision to this 
whole thing ready by Monday, I hope. However, the general approach of 
defining rdfs entailment in terms of simple entailment on closures 
seems to be sound.

-------------------------------------------

Pat


-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes

Received on Friday, 28 September 2001 23:01:27 UTC