Re: checked RDF semantics for XSD stuff, couldn't grok namespace entailment from pat hayes on 2002-12-14 (w3c-rdfcore-wg@w3.org from December 2002)

From: pat hayes <phayes@ai.uwf.edu>
Date: Sat, 14 Dec 2002 01:31:34 -0600
To: Dan Connolly <connolly@w3.org>, bwm@hplb.hpl.hp.com
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <p05111b18ba2072326e34@[10.0.100.86]>
Brian, Dan makes some good points that would be best fixed. How much 
of a disaster would it be if I were to give you a revised snapshot by 
say Monday EOW my time (evening your time)?? All the changes will be 
link-fixings and small text-edits, nothing earthshaking. I also plan 
to call out formal technical definitions, give them all anchors and 
put in links from every term use to a glossary entry or definition, 
throughout the text.

Pat
---------

>I noticed some places in earlier drafts
>of the semantics doc that should cite various
>XML schema specs. Since I've been dubbed
>"voice of the XMLSchema WG"[11Dec], I'm
>reviewing it today...
>
>http://www.coginst.uwf.edu/~phayes/RDF_Semantics_finalCall_1.html
>Fri, 13 Dec 2002 21:02:40 GMT
>
>[11Dec]
>http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Dec/0165.html
>
>In sum, the lack of citations has since been
>fixed (though I did run across a broken
>internal link, which see below).
>
>Some stuff I noticed along the way...
>hmm... is any of it critical? Well,
>I think the bit about semantic extension
>is CRITICALly unclear; I can't tell
>if it's coherent or not because I can't
>follow the pointers within the document.
>
>Comments in document order...
>most of these are editorial...
>I'll mark the ones that I think could
>be observed objectively (i.e. thru testing)
>as WRONG.
>
>
>| 0.1 Specifying a formal semantics: scope and limitations
>
>| that model theory might be better called 'interpretation theory'
>
>why is it not, then? [answer: cuz that's not
>what logicians call it. Maybe cite tarski
>or whatever the seminal work is, informatively?]

I'll try to work in a reference, but this could turn into a whole 
essay, and the remark was only intended to be helpful.

>
>| The restriction to a mononotic logic is a
>| deliberate design decision, however.
>
>is that design decision motivated in the concepts
>doc? (see my earlier message about scalability)

I'll search and put in a link if I can find anything.

>If so, perhaps a cross-reference?
>Of not, bummer.

Indeed.

>
>| and OWL [OWL,
>
>missing ]

Will fix

>
>| language of RFC 2119
>
>citation/pointer/href?

Whoops, there should be an explicit reference. Will fix.

>
>| All such extensions MUST conform to the semantic
>| conditions in this document.
>
>Hmm... not clear what conforms(semExt, thisDoc)
>means. CRITICAL.

I followed Jeremy's suggestions here. I am quite happy to take out 
all this RFC 2119 language if you prefer: I really don't quite know 
what the hell it is supposed to be saying, to tell you the truth.

Or, I could try SAYING what conforms(semExt, this Doc) means. I do 
that, actually, but it is rather throwaway and could be done more 
elaborately: "Any entailment which is valid according to the 
semantics described here MUST continue to be valid in any extended 
semantics imposed on an RDF namespace." In other words, it is a 
conformance requirement on semantic theories of semantic extensions. 
I realize this is unusual, but the definitions of the 2119 language 
seem to apply perfectly well to this case as well as to software 
specifications; and indeed RFC 2119 itself uses its own vocabulary in 
senses that go beyond purely software specification uses.

A better way to phrase the intent here might be to state it directly 
as a condition on semantic theories, like this:

"Any extended semantic theory imposed on a RDF semantic extension 
MUST be such as to define a notion of entailment which preserves 
validity as described here."


>| Any entailment which is valid according to the
>| semantics described here MUST continue to be valid in
>| any extended semantics imposed on an RDF namespace.
>
>What's an RDF namespace?

Good question. I was thinking of it simply as a set of names, ie a 
set of urirefs and/or literals.  How about if I just said 'RDF 
vocabulary'?

>  The term 'namespace' is
>actually not defined (at least, not to my satisfaction)
>in the XML Namespaces spec. Only the term 'namespace name'
>is. Maybe I didn't read enough of the document;
>but whatever I should have read should be forward-
>referenced.

The appropriate reference would be back to the notion of a 'name'. I 
will try to clean this up and avoid the term 'namespace' altogether. 
The XML usage is getting in the way here.

>
>
>| as introduced in section 3 below.
>
>no link?

Do we need links to numbered sections? BUt I will put one in.

>
>I can't find any occurence of 'namespace entailment'
>in section 3. CRITICAL.

I think this is an editing bug. At some point in the past I changed 
'namespace entailment' to 'vocabulary entailment', and some 
occurrences may be orphaned. I will go through and purge/rationalize 
this more carefully. Assume that the right term is 'vocabulary 
entailment' and that this will be fixed.

>
>
>| In the interests of brevity, we use the imaginary URI
>| scheme 'ex:' to provide illustrative examples. To obtain
>| a more realistic view of the normal appearance of the
>| N-triples syntax, the reader should imagine this replaced
>| with something like
>'http://www.example.org/rdf/mt/artificial-example/'.
>
>That's really wierd. Why make up an imaginary URI scheme?
>Instead of <ex:a>, why not write ex:a and pretend
>ex: is a namespace prefix, just like rdf: and rdfs:?

I did originally, but this usage was recommended by others in the WG.

>
>If ex: is really an imaginary URI scheme, then why
>replace it by 'http:...'?
>
>Well, after reading the rest, I see you do use
>ex: consistently as an imaginary URI scheme.
>Very well, then.
>
>
>| arcs are of course never merged.
>
>Why not? Merging
>{ <ex:a> <ex:b> <ex:c> } with
>{ <ex:a> <ex:b> <ex:c> } gives
>{ <ex:a> <ex:b> <ex:c> }, no?
>
>WRONG.

Indeed, should have been deleted in an earlier pass. Will do.

>
>
>| A name is a uriref or a typed literal.
>
>why not untyped literals? Oh... "We do not think of
>plain literals as names because their interpretation
>is fixed by the RDF semantic rules." Very well. thanks.
>
>| An instance of an RDF graph is, intuitively, a
>| similar graph in which some blank nodes may have
>| been replaced by urirefs or literals.
>
>At the risk of confusing natural language intuitions
>with formal specification, that would be a lot more
>clearly intuitive with an example ala:
>
>	An instance of "Some page was written
>	by Bob" is "Page47 was written by Bob."
>

OK, will do.

>| The use of the explicit extension mapping also makes it
>| possible for two properties to have exactly the same values,
>| or two classes to contain the same instances, and still
>| be considered distinct.
>
>strike 'considered'.

OK.

>
>| It assumes, implicitly, that urirefs have the same
>| meaning whenever they occur.
>
>I think that overstates the case; the semantics only
>assumes that each interpretation gives one denotation
>to each uriref; interpretations corresponding to different
>times might give different denotations to the same uriref.

No. How can an interpretation correspond to a time or the set of 
interpretations change with time? The truth conditions do not change 
with time. What you say only makes sense if interpretations are 
somehow defined to be temporally parameterized, but they aren't.

>
>WRONG.
>
>(this would need a different sort of test than
>the ones in our test document, but I think it
>could be observed objectively.)

I would like to see it. Heres an argument why not: suppose that 
something is entailed by a graph at one time. Then it is entailed by 
the same graph at any other time, since entailment makes no reference 
to time (same applies to place, manner, person thinking about the 
entailment, whatever). Any graph entails itself. So, any graph at one 
time entails the same graph at any other time.

>
>| We do not make any assumptions about the relationship
>| between the denotation of a uriref and a document or
>| network resource which can be obtained by using that uriref
>| in an HTTP transfer protocol.
>
>Again, that overstates the case. 'We' the formal semantics
>editor don't make any such assumption.

Right, that is what is meant. Restate this to say 'the formal 
semantics does not assume any. particular relationship....'

>But we the RDF
>Core WG do expect that URIs will usually be used in RDF
>consistently with their use in HTTP, HTML and other conventional
>contexts.

Do we?? That is news to me. I was under the very strong impression 
that we were not making that assumption, in fact, and that the use of 
urirefs in RDF was not at all aligned with their use in HTTP or HTML. 
There is nothing anywhere in RDF that assumes that a uriref has 
anything at all to do with whatever happens when you use that uriref 
in an HTTP protocol. We have discussed this issue at length several 
times and nobody has ever suggested otherwise.

>This is what the intro to the semantics says;
>directly only briefly, but indirectly thru the concepts
>doc more elaborately.

In that case, it seems to me that the docs as a whole say that RDF 
does not in fact mean what the semantics doc says it means. If that 
is really the case, then we should say so very clearly, indeed. 
However, I wish that you had mentioned this earlier. This wording has 
been in the semantics document unchanged now for many months, through 
many edits and rewrites, and you have not had any problems with it 
before. And in fact the reason it is in there is because you, Dan, 
told me explicitly to NOT try to take account of stuff like this when 
doing the MT, but to treat urirefs as simple logical names.

>
>| It has been argued that urirefs in the form of HTTP URIs
>| should be required to denote the document that results from
>| such a retrieval.
>
>I don't know why you point out that misconception;
>the more architecturally consistent view is that
>it denotes a thing that, when sent GET messages,
>sends you back a document.

I just wanted to say that we aren't assuming anything like this. It 
doesnt seem necessary to get it exactly right. I will re-word this 
more directly.

>
>| 1.3 Interpretations
>
>I'm skipping ahead at this point...
>
>
>
>| 3.4 Datatyped interpretations
>
>| A datatype is identified by a uriref and is characterized
>| by a set of character strings called lexical forms
>| and a mapping from that set to a set of values.
>
>'identified by a uriref' is superfluous, right?
>Datatypes are no more nor less identified by uriref
>than are web pages, pigs, numbers, or fairies, right?

No, because the semantics assumes that datatype uriref is an actual 
name, not just a logical constant. It is required to have a single 
fixed meaning, not one that is hostage to different interpretations. 
Maybe I should make this clearer.

>broken link/anchor?
>http://www.coginst.uwf.edu/~phayes/RDF_Semantics_finalCall_1.html#ref-xml-schema2

Whoops, sorry. should be ...#ref-xmls.  Will fix.

>
>
>| we will refer to particular aspects of these conditions
>| as different kinds of information about a datatype.
>
>ugh... do we really need to?

Yes, strictly. There really is no way to state the datatype stuff as 
'pure' model theory and cover all the cases. Which rules apply when? 
It all depends on what you know. But see below.

>
>no, after reading more, I don't see where
>you actually do what you say you will here.

In section 4.3  BUt I will omit this stuff here and take it all over 
to 4.3, I think would be best.

>
>| where D is supposed to characterize information about some
>| set of datatypes
>
>ugh... why don't you/we just say
>
>	where D is a set of datatypes
>
>?

Because that doesnt define it properly. Some datatypes might provide 
more information that others. But OK, not important, I'll just change 
it to 'set of datatypes' and leave the "information"  talk until 
section 4.3, which is where it really belongs, OK?

>
>You later write things like ICEXT(I(rdfs:Datatype)) = D
>which seem to say that D is a set of datatypes.
>
>
>| any uriref beginning http://www.w3.org/2001/XMLSchema#sss
>
>awkward/redundant; either say
>	any uriref of the form ...#sss

Yes, that is what it should say. Will fix.

>or
>	any uriref beginning ...#
>
>| ICEXT(I(rdfs:Datatype)) = D
>
>hmm... isn't that more constraining than it needs to be?
>we only need ICEXT(I(rdfs:Datatype)) to include D, right?
>
>WRONG. (I think).

Im genuinely not sure about this. There are arguments both ways. We 
certainly want to say that if something is asserted to be in 
rdfs:Datatype then it is recognized. So the inclusion must be the 
other way: ICEXT(I(rdfs:Datatype)) is a subset of D; otherwise there 
could be a something of the type that isn't recognized. But if we 
also want to say that it is kosher to complain when information is 
unobtainable about some recognized datatype, then we probably should 
rule out having datatypes that are just kind of counted as recognized 
even if nobody has mentioned them, which would be possible if D were 
larger than the class extension. On balance it seems simpler just to 
say that they are identical.

>| a datatype violation MAY be treated as a error condition.
>
>seems superfluous. An error condition in
>what context/process/protocol?

Any inference process or entailment-checking protocol. What I have in 
mind is a complete reasoner which is kind of under contract to 
produce all valid entailments. It should be OK for it to barf when it 
finds a violation,  refuse to draw any further conclusions, and still 
count itself as complete.

Im not allowed to talk about complete reasoners, so I thought that 
just talking about error conditions was a neutral way of getting the 
point across.

Just calling this a datatype violation doesn't say anything about 
reasoning processes.

>
>Just giving it the name 'datatype violation' seems
>sufficient. (is that term anchored?
>hmm... it's not in the glossary. I guess
>the glossary is only for terms imported
>from conventional literature, not for novel
>terms in this spec. hmm... I hate having
>to view-source to find anchor names; it would
>be nice to have the terms introduced in this
>spec collected somewhere.)

OK, I could break out the definitions and give links to them, in the 
style used in the XML Schema doc. Thats a bit more work. I can 
probably get to that on Monday.

>
>| a graph which contains a literal with a non-well-formed
>| XML string or an illegal language tag, and which is typed
>| with rdf:XMLLiteral is always considered a datatype violation.
>
>eek... there are no RDF graphs with non-well-formed XMLLiteral
>thingies, are there? i.e. the things in the graph
>are post-xml-parsing, no? Ah... no, I guess they're
>pre-parsing strings. OK. very well.
>
>| Similarly, RDF does not assume that its literal strings
>| are identical to elements of the class xsd:string, even
>| though both are defined as sequences of unicode characters.
>
>Now that confuses me... if you're saying that RDF
>doesn't include a specification of what the elements
>of the class xsd:string are, and therefore it
>doesn't specify that literal strings are in that class,
>then very well. But surely in the case of D-interpretations,
>i.e. where the class xsd:string *is* specified,
>RDF's literal strings are in that class, no?

NO, we decided that explicitly, I believe, a while ago. That is, they 
might be but they might also not be.  The RDF strings are in the RDF 
syntax; the XSD strings are in a semantic value space somewhere. On 
what grounds do you think they must be identical?

>
>WRONG.
>
>Jan/Jeremy, please add this test case:
>
>	(empty graph)
>	XSD-entails
>	"abc" rdf:type xsd:string.

I think this should not be entailed. If it is, then we need to 
explicitly redefine simple literals to be XSD strings in all the 
documents.  This raises a host of issues about checking identity of 
things across datatypes, and whether RDF syntax can only be 
interpreted using XML Schema datatyping. .

>
>
>| Although the definition of entailment means that a D-inconsistent
>| graph D-entails any RDF graph, it will usually be inappropriate
>| to consider such entailments as useful consquences since they
>| would not be valid rdf- or rdfs- entailments.
>
>Really? Oh... had to read that 5 times before I got it.
>OK.
>
>| A datatype clash may be considered to be a datatype
>| error condition.
>
>I don't think that adds anything. I suggest striking it.

OK, can do.

>
>| It makes no provision for assigning a datatype to the range
>| of a property, for example
>
>???? what about
>	my:age rdfs:range xsd:integer.
>???
>
>WRONG.

Badly stated. I meant to indicate that we don't do range datatyping. 
Will rewrite to:

makes no provision for attaching a datatype to a property so as to 
apply to all values of the datatype.

>
>| and does not provide any way of explicitly asserting that
>| a blank node denotes a particular value under a datatype
>| mapping.
>
>/me sighs
>
>I sure wish we had specified that
>
>   _:fourtyTwo xsd:integer "42".
>
>works. Bummer. Rats. Frap.

Yeh, I tend to agree that would have been harmless and useful. Im 
still not sure how we lost that, to tell you the truth. It seems to 
have just got dropped and we were in too much of a hurry to pick it 
up again.

>
>| We expect that semantic extensions and future versions
>| of RDF will impose more elaborate datatyping conditions.
>
>Ah; good.
>
>(that sort of side-note is traditionally
>marked up with some NOTE: thingy. Hmm... the
>tradition hasn't been codified in
>http://www.w3.org/2001/06/manual/ )
>
>
>| A graph rdf-entails (rdfs-entails) another just when its
>| rdf-closure (rdfs-closure) simply entails it.
>
>ooh! nifty...

Hey, check out the Herbrand entailment lemma, for something really neat.

>will that work for universal quantification
>too? Ah... no...
>
>| It is not possible to provide such a tight result
>| for D-entailment closures since the relevant semantic
>| conditions require identities which cannot be stated in RDF.
>
>
>OK, I read carefully up thru...
>
>| 4.1 Rdf-entailment and rdf closures (Informative)
>
>That's it, for now, at least.

OK, thanks.

>
>--
>Dan Connolly, W3C http://www.w3.org/People/Connolly/


-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola              			(850)202 4440   fax
FL 32501           				(850)291 0667    cell
phayes@ai.uwf.edu	          http://www.coginst.uwf.edu/~phayes
s.pam@ai.uwf.edu   for spam
Received on Saturday, 14 December 2002 02:31:50 UTC