W3C home > Mailing lists > Public > w3c-rdfcore-wg@w3.org > November 2001

Re: datatypes and MT

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Tue, 6 Nov 2001 19:21:40 -0600
Message-Id: <p05101044b80e364e1e72@[]>
To: Dan Connolly <connolly@w3.org>
Cc: w3c-rdfcore-wg@w3.org
>  > >abstract syntax:
>>  >       terms:
>>  >               constants (URIs w/fragids)
>>  >               string literals
>>  >               bnodes (existentially quantified variables)
>>  >       statement:
>>  >               term term term.
>>  We've already gone into why this simple a syntax does not work.
>Really? pointer? example?

It's in the archive somewhere. First, having literals on arcs doesn't 
make sense. Next, we need a way to say when the same node is being 
used in two places, so we need nodeIDs or some such, even on 
literals.  Third, most seriously, having bnode labels on arcs 
requires some notion of scope (in the graph, not the Ntriples doc.) I 
would add that it forces us to incorporate nodeIDs (actually better 
called arcIDs in this case) into the RDF graph itself, which is a 
'slippery slope' idea that several people didn't like: Frank Manola, 
I believe, was one.

>It works to my satisfaction, after considerable study
>and implementation experience. At least two other
>WG members also said they prefer this abstract syntax.

I would like it too if it could be made to work. If we weren't 
restricted to simple graphs and obliged to only use urirefs as names, 
then it would work fine.

>  > >       formula:
>>  >               statement*
>>  >with the traditional interpretation structure
>>  >(with the IEXT() indirection trick).
>>  >
>>  >That's it.
>>  >
>>  >If we want to say "my shoe size is some
>>  >integer whose decimal representation is '10'",
>>  But I don't want to say that. I want to say that my shoe size is 10.
>Yes, but we're not here to design a syntax; we're here to
>clarify the existing one. And the existing one doesn't have
>that expressive power, I don't believe.


Pat shoeSize "10" .

illegal RDF, in the current syntax? (Assuming that 'Pat' and 
'shoeSize' were urirefs, of course.)? Because I read that as saying 
that my shoe size is ten.

>>  (If you know about shoe sizes, you ought to know what that "10"
>>  means.) I also, for similar reasons, don't want to have to say that
>>  the result of adding some integer whose decimal representation is "6"
>>  and anther integer whose decimal representation is "4" is a third
>>  integer whose decimal representation is "10"; I would much rather not
>>  have to even mention representations at all, and just write 4+6=10,
>>  like people have done for about the last millennium. And the reason
>>  in both cases is because it is SIMPLER.
>>  Heres why it is simpler, apart from the obvious avoidance of needless
>>  literal-bloat. That innocent-looking decimalRep only works if you
>>  change the semantic rules at that point.
>>  Rather than treat the
>>  literal as naming what it denotes, like all the other nodes in the
>>  RDF graph, you have to read it as naming itself.
>>  At literal nodes,
>>  and only at literal nodes, the semantic rules have to be rewritten
>>  just to accommodate this oddity. This would be OK if everyone were to
>>  keep very clear in their heads that the rules are being changed, but
>>  I suspect that is going to be too much to ask, and that some people,
>>  not unreasonably, are going to interpret literals like all other
>>  names (that being the way they are usually understood everywhere
>>  else) and will thus be dropped into a mire of confusion; and others,
>>  perhaps, may push the balance the other way, and start treating all
>>  names as self-denoting (a tendency that the generic DPH seems to have
>>  innately in any case), which is an even deeper mire.
>The usual interpretation of "10" -- e.g. in KIF -- is
>a string of two characters, no? I don't see what's unusal
>about what I'm suggesting.

OK, Ive been assuming that the quote marks around literal labels are 
only a syntactic device for marking them as literals, not intended to 
be interpreted as actual quotation marks. They are an inheritance 
from XML, right? And XML is completely sloppy about use and mention, 
and uses quotation syntax to mean all kinds of things.  If we are 
obliged to interpret those as genuine quote marks, then I give up; 
there is nothing to debate: all literals are character strings, end 
of story. But then if I write

Pat shoeSize "10"

then I am saying that my shoe size is a character string, right?

 From here on I will omit the double quotes to avoid confusion.

>  > The great advantage, to me, of the MT extension is that it keeps the
>>  RDF simpler. The graphs are simpler, and the inferences are simpler.
>>  All datatyping inference is done by RDFS up to the point where
>>  rdf:type connects a datatype to a literal, then that datatype mapping
>>  handles that literal; that's all most people will need to know. Also
>>  the basic semantic rules are simpler; ALL labels denote in the same
>>  way, for example, and datatype value domains are classes, as any
>>  reasonable user might expect. The only thing that is more complicated
>>  to state is the model theory itself, for technical reasons; but even
>>  there, the extra complexity only affects literals in the presence of
>>  datatyping, and can be factored away from the rest of the MT very
>>  cleanly. And most of the complexity is just in stating datatypes
>>  themselves as semantic constructs (which is why we need DT and DTS
>>  and DTC).
>I don't see the simplicity anywhere.

Well, I do. The motivating example for me was the use of range 
information to fix a datatype, as in

aaa shoeSize 10 .
shoeSize rdfs:Range xsd:integer .

which seems to me to be eminently simple. Obviously, the literal 
should be interpreted using xsd:integer. No need to apply rules or 
rewrite using bnodes; you just use the literal  in the triple.

>>  >that's easy:
>>  >
>>  >       :me :shoeSize _:x.
>>  >       _:x rdf:type :integer.
>>  >       _:x :decimalRep "10".
>>  >
>>  >which can be written in RDF/xml 1.0 very easily:
>>  >
>>  >       <rdf:Description rdf:about="#me">
>>  >         <shoeSize>
>>  >             <integer decimalRep="10"/>
>>  >         </shoeSize>
>>  >         </rdf:Description>
>>  >
>>  >To fill in the details... let dt:
>>  >be the namespace of XML Schema primitive datatypes,
>>  >and let rdfs:str be a new property
>>  >that relates XML Schema datatype to strings;
>>  >it's unambiguous over each of the primitive datatypes;
>>  Are you sure?
>More or less, yes.
>>  What about things like leading zero suppression?
>I just made up the terms "integer" and "decimalRep"
>but in the refinement below, such details are attended
>to by way of the work of the XML Schema WG:
>   "Leading and trailing zeroes are optional."
>   -- http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#decimal
>>  But in
>>  any case, the natural way to describe a datatype mapping is going
>>  from the lexical domain to the value domain. You are forced to talk
>>  about the inverse of this mapping. As well as being unnecessary, this
>>  assumes that there always is a unique inverse mapping.
>Yes, as I said, the mapping is unambiguous. This doesn't seem
>all that awkward to me.

But if leading zeros are optional, the lexical-to-value mapping is 
not invertible, so what is its inverse supposed to be?

>>  >in the case of dt:string, it's the identity relation.
>>  >
>>  >       <rdf:Description rdf:about="#me">
>>  >         <shoeSize>
>>  >             <dt:decimal rdfs:str="10"/>
>>  >         </shoeSize>
>>  >         </rdf:Description>
>>  I still have trouble with RDF/XML,  so sorry if I'm confused, but
>>  does that create *three* triples, with two bnodes in common:
>>  #me shoeSize _:1 .
>>  _:1 dt:decimal _:2 .
>>  _:2 rdfs:str "10" .
>no... the <shoeSize> element is a property element,
>and the <dt:decimal> element is a typednode... you
>might check out Brickley's explanation of "striping"
>in RDF syntax. It parses as:
>         <...#me> <...#shoeSize> _:x.
>         _:x <...rdf-syntax-ns#type> <...#integer>.
>         _:x <...#decimalRep> "10".

Ah, OK.  Thanks. So this is the same proposal we are debating right now, good.

IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
Received on Tuesday, 6 November 2001 20:21:54 UTC

This archive was generated by hypermail 2.4.0 : Friday, 17 January 2020 20:24:06 UTC