Re: datatypes and MT from Dan Connolly on 2001-11-06 (w3c-rdfcore-wg@w3.org from November 2001)

From: Dan Connolly <connolly@w3.org>
Date: Tue, 06 Nov 2001 02:13:30 -0600
To: Pat Hayes <phayes@ai.uwf.edu>
CC: w3c-rdfcore-wg@w3.org
Message-ID: <3BE79BAA.78B10494@w3.org>
Pat Hayes wrote:
> 
> >Pat Hayes wrote:
> >[...]
> >>  Let me try to first summarize the MT changes that I managed to
> >>  extract from the pfps/ph interchange
> >
> >I object to this proposal entirely.
> >
> >>  For example, the following graph written in bnode-style:
> >>
> >>  aaa bbb _:1 .
> >>  _:1 rdf:value "345" .
> >>  _:1 rdf:type xsd:integer .
> >>
> >>  would be boiled down into:
> >>
> >>  aaa bbb _:1:"345" .
> >>  _:1 rdf:type xsd:integer .
> >
> >Let's not muck up the model theory like this.
> 
> It doesn't muck up the model theory.

To my eye, it does.

> >Let's
> >keep it simple:
> 
> ?? See below for why this isn't simple.
> 
> >abstract syntax:
> >       terms:
> >               constants (URIs w/fragids)
> >               string literals
> >               bnodes (existentially quantified variables)
> >       statement:
> >               term term term.
> 
> We've already gone into why this simple a syntax does not work.

Really? pointer? example?

It works to my satisfaction, after considerable study
and implementation experience. At least two other
WG members also said they prefer this abstract syntax.
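
For concreteness, here is a minimal sketch of that abstract syntax in Python. The class names (Constant, Literal, BNode) and the bnodes() helper are made up for illustration and come from no spec; the sample graph is the bnode-style one quoted earlier in this thread.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple, Union


@dataclass(frozen=True)
class Constant:
    uri: str    # a URI, possibly with a fragment id


@dataclass(frozen=True)
class Literal:
    text: str   # a plain string literal


@dataclass(frozen=True)
class BNode:
    label: str  # an existentially quantified variable


Term = Union[Constant, Literal, BNode]
Statement = Tuple[Term, Term, Term]  # "term term term."


def bnodes(graph: List[Statement]) -> Set[BNode]:
    """Collect the existential variables used anywhere in the graph."""
    return {t for stmt in graph for t in stmt if isinstance(t, BNode)}


# The bnode-style graph from earlier in the thread (names kept as written).
b1 = BNode("1")
graph: List[Statement] = [
    (Constant("aaa"), Constant("bbb"), b1),
    (b1, Constant("rdf:value"), Literal("345")),
    (b1, Constant("rdf:type"), Constant("xsd:integer")),
]
```

Nothing here treats a literal differently from any other term; it is just one of three kinds of thing that can fill a slot in a statement.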


> >       formula:
> >               statement*
> >with the traditional interpretation structure
> >(with the IEXT() indirection trick).
> >
> >That's it.
> >
> >If we want to say "my shoe size is some
> >integer whose decimal representation is '10'",
> 
> But I don't want to say that. I want to say that my shoe size is 10.

Yes, but we're not here to design a syntax; we're here to
clarify the existing one. And the existing one doesn't have
that expressive power, I don't believe.

> (If you know about shoe sizes, you ought to know what that "10"
> means.) I also, for similar reasons, don't want to have to say that
> the result of adding some integer whose decimal representation is "6"
> and another integer whose decimal representation is "4" is a third
> integer whose decimal representation is "10"; I would much rather not
> have to even mention representations at all, and just write 4+6=10,
> like people have done for about the last millennium. And the reason
> in both cases is because it is SIMPLER.
> 
> Here's why it is simpler, apart from the obvious avoidance of needless
> literal-bloat. That innocent-looking decimalRep only works if you
> change the semantic rules at that point.

???

> Rather than treat the
> literal as naming what it denotes, like all the other nodes in the
> RDF graph, you have to read it as naming itself.

???

> At literal nodes,
> and only at literal nodes, the semantic rules have to be rewritten
> just to accommodate this oddity. This would be OK if everyone were to
> keep very clear in their heads that the rules are being changed, but
> I suspect that is going to be too much to ask, and that some people,
> not unreasonably, are going to interpret literals like all other
> names (that being the way they are usually understood everywhere
> else) and will thus be dropped into a mire of confusion; and others,
> perhaps, may push the balance the other way, and start treating all
> names as self-denoting (a tendency that the generic DPH seems to have
> innately in any case), which is an even deeper mire.

???

The usual interpretation of "10" -- e.g. in KIF -- is
a string of two characters, no? I don't see what's unusual
about what I'm suggesting.
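
To make the reading I have in mind concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the tiny universe, the IS map from constants to resources, the IEXT table, and the convention that a term written with double quotes is a literal. It only shows literals denoting strings and bnodes acting as existential variables.

```python
from itertools import product

# A toy interpretation (all of it hypothetical): a universe of things,
# a map IS from constants to things, and IEXT giving each property's
# extension as a set of (subject, object) pairs.
UNIVERSE = ["me", 10, "10", "shoeSize", "type", "Integer", "decimalRep"]
IS = {
    ":me": "me",
    ":shoeSize": "shoeSize",
    "rdf:type": "type",
    ":integer": "Integer",
    ":decimalRep": "decimalRep",
}
IEXT = {
    "shoeSize": {("me", 10)},       # my shoe size is the number ten
    "type": {(10, "Integer")},
    "decimalRep": {(10, "10")},     # ten is written "10" in decimal
}


def denote(term, assignment):
    if term.startswith("_:"):
        return assignment[term]     # bnode: whatever the assignment picks
    if term.startswith('"'):
        return term[1:-1]           # literal: the string itself
    return IS[term]                 # constant: looked up in IS


def satisfies(graph):
    """True if some assignment of bnodes to things makes every triple hold."""
    variables = sorted({t for s in graph for t in s if t.startswith("_:")})
    for values in product(UNIVERSE, repeat=len(variables)):
        a = dict(zip(variables, values))
        if all((denote(s, a), denote(o, a)) in IEXT.get(denote(p, a), set())
               for s, p, o in graph):
            return True
    return False


bnode_graph = [
    (":me", ":shoeSize", "_:x"),
    ("_:x", "rdf:type", ":integer"),
    ("_:x", ":decimalRep", '"10"'),
]
direct_graph = [(":me", ":shoeSize", '"10"')]
```

Under this toy interpretation, satisfies(bnode_graph) holds with _:x bound to the number 10, while satisfies(direct_graph) does not, because there the object is the two-character string rather than the number.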


> The great advantage, to me, of the MT extension is that it keeps the
> RDF simpler. The graphs are simpler, and the inferences are simpler.
> All datatyping inference is done by RDFS up to the point where
> rdf:type connects a datatype to a literal, then that datatype mapping
> handles that literal; that's all most people will need to know. Also
> the basic semantic rules are simpler; ALL labels denote in the same
> way, for example, and datatype value domains are classes, as any
> reasonable user might expect. The only thing that is more complicated
> to state is the model theory itself, for technical reasons; but even
> there, the extra complexity only affects literals in the presence of
> datatyping, and can be factored away from the rest of the MT very
> cleanly. And most of the complexity is just in stating datatypes
> themselves as semantic constructs (which is why we need DT and DTS
> and DTC).

I don't see the simplicity anywhere.

> >that's easy:
> >
> >       :me :shoeSize _:x.
> >       _:x rdf:type :integer.
> >       _:x :decimalRep "10".
> >
> >which can be written in RDF/xml 1.0 very easily:
> >
> >       <rdf:Description rdf:about="#me">
> >         <shoeSize>
> >             <integer decimalRep="10"/>
> >         </shoeSize>
> >       </rdf:Description>
> >
> >To fill in the details... let dt:
> >be the namespace of XML Schema primitive datatypes,
> >and let rdfs:str be a new property
> >that relates XML Schema datatype to strings;
> >it's unambiguous over each of the primitive datatypes;
> 
> Are you sure?

More or less, yes.

> What about things like leading zero suppression?

I just made up the terms "integer" and "decimalRep"
but in the refinement below, such details are attended
to by way of the work of the XML Schema WG:

  "Leading and trailing zeroes are optional."
  -- http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#decimal


> But in
> any case, the natural way to describe a datatype mapping is going
> from the lexical domain to the value domain. You are forced to talk
> about the inverse of this mapping. As well as being unnecessary, this
> assumes that there always is a unique inverse mapping.

Yes, as I said, the mapping is unambiguous. This doesn't seem
all that awkward to me.
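
A rough sketch of what I mean by unambiguous, in Python. It assumes that Python's decimal.Decimal is a fair stand-in for the XML Schema lexical-to-value mapping on these few strings (it is not the schema's own definition), and it reads "unambiguous" as: each string picks out at most one value.

```python
from collections import defaultdict
from decimal import Decimal


def value_of(lexical):
    """Stand-in lexical-to-value mapping (an assumption, not XML Schema)."""
    return Decimal(lexical)


# Several lexical forms; leading and trailing zeroes are optional.
lexical_forms = ["10", "010", "10.0", "+10"]

# The relation read as (value, string) pairs, i.e. "value has this str".
str_pairs = {(value_of(s), s) for s in lexical_forms}

values_for = defaultdict(set)   # string -> values it can stand for
strings_for = defaultdict(set)  # value  -> strings that stand for it
for value, string in str_pairs:
    values_for[string].add(value)
    strings_for[value].add(string)

# Each string determines exactly one value...
string_determines_value = all(len(v) == 1 for v in values_for.values())
# ...but one value may have many strings.
value_determines_string = all(len(s) == 1 for s in strings_for.values())
```

So the relation is many-to-one from strings to values: given the string you know the value, but a value does not have a single preferred string.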

> >in the case of dt:string, it's the identity relation.
> >
> >       <rdf:Description rdf:about="#me">
> >         <shoeSize>
> >             <dt:decimal rdfs:str="10"/>
> >         </shoeSize>
> >       </rdf:Description>
> 
> I still have trouble with RDF/XML, so sorry if I'm confused, but
> does that create *three* triples, with two bnodes in common:
> 
> #me shoeSize _:1 .
> _:1 dt:decimal _:2 .
> _:2 rdfs:str "10" .

no... the <shoeSize> element is a property element,
and the <dt:decimal> element is a typednode... you
might check out Brickley's explanation of "striping"
in RDF syntax. It parses as:

        <...#me> <...#shoeSize> _:x.
        _:x <...rdf-syntax-ns#type> <...#decimal>.
        _:x <...#str> "10".


> (If not, how does the 'rdfs:str' get into the second triple??)

it's a propattr; i.e. it's a property of the thing which
is of type decimal.
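
If it helps, the striping can be sketched as a small Python function using the standard xml.etree.ElementTree parser. This is only a sketch covering the three constructs in the example (a node element, a property element, a property attribute), not a full RDF/XML parser; the example.org namespace URIs are placeholders I made up so the snippet parses, and the _:b0 label is arbitrary.

```python
import itertools
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Namespace URIs other than rdf: are placeholders, not the real ones.
DOC = """
<rdf:Description rdf:about="#me"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://example.org/terms#"
    xmlns:dt="http://example.org/dt#"
    xmlns:rdfs="http://example.org/rdfs#">
  <shoeSize>
    <dt:decimal rdfs:str="10"/>
  </shoeSize>
</rdf:Description>
"""


def uri(name):
    """Turn ElementTree's '{namespace}local' form into namespace + local."""
    return name[1:].replace("}", "", 1) if name.startswith("{") else name


def parse_node(elem, triples, counter):
    """Node stripe: returns the subject and records its triples."""
    about = elem.get("{%s}about" % RDF)
    subject = about if about is not None else "_:b%d" % next(counter)
    if elem.tag != "{%s}Description" % RDF:
        # A typed node: the element name gives the rdf:type.
        triples.append((subject, RDF + "type", uri(elem.tag)))
    for name, value in elem.attrib.items():
        if not name.startswith("{%s}" % RDF):
            # A property attribute: a property of this node, literal value.
            triples.append((subject, uri(name), '"%s"' % value))
    for prop in elem:            # property stripe
        for child in prop:       # nested node stripe
            obj = parse_node(child, triples, counter)
            triples.append((subject, uri(prop.tag), obj))
    return subject


def parse(xml_text):
    triples = []
    parse_node(ET.fromstring(xml_text), triples, itertools.count())
    return triples


triples = parse(DOC)
```

Run on the example, this yields three triples sharing a single bnode: the shoeSize triple from #me to the bnode, the type triple, and the str triple with the literal "10".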

> >I use rdfs:str rather than rdf:value because M&S 1.0
> >(and some Dublin Core documentation) suggest that
> >rdf:value is for some weird sort of currying.
> 
> I agree that rdf:value seems to be both a bad term and hopelessly
> compromised at this point.

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Tuesday, 6 November 2001 03:13:24 UTC