Re: Datatypes, abstract syntax / data model. from Graham Klyne on 2002-09-14 (w3c-rdfcore-wg@w3.org from September 2002)

From: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
Date: Sat, 14 Sep 2002 08:51:10 +0100
To: Jan Grant <Jan.Grant@bristol.ac.uk>
Cc: RDFCore Working Group <w3c-rdfcore-wg@w3.org>
Message-Id: <5.1.0.14.2.20020914083946.038f6a40@127.0.0.1>

Jan,

A question:  is it therefore your intention that a literal expressed in a 
non-canonical form is not permitted to appear in a graph?

e.g. the pair <xsd:integer,"010"> does not describe a valid literal?

The reason I ask this is because I was trying to figure how to achieve your 
stated goals without insisting that a canonical form be used, but could not 
think of a way to do so without invoking some kind of processing model 
(e.g. an implementation MAY change a literal representation to canonical 
form ...).

I agree with the motivation that we should not get hung up on preserving 
different lexical forms for typed literal values.  (If the literal form is 
important, then it should be presented as a string, right?)  However, I do 
think that the datatype used should ideally be preserved.

Hmmm... can we get away with saying something like: "graphs that differ 
only in the specific literal forms used to represent the same values may re 
regarded as interchangeable"?  (That's a fudge, I'm trying to stop short of 
saying they're equal, because that imposes a burden on applications to know 
about the datatypes.  Semantically, I think it's clear that they convey the 
same meaning, anyway.)

#g
--

At 04:23 PM 9/13/02 +0100, Jan Grant wrote:

>My vote was for Jos' suggestion, which was: a datatyped literal is
>held (in the abstract syntax) to be the pair
>
>         ( datatype uri, clex )
>
>..where clex is the canonical lexical form for the datatype; secondly,
>to make the assertion that such a canonical form exists.
>
>
>Patrick S asked, "what if there is no such form?" Well, I choose to
>accept the axiom of choice, which (I think) covers this*. Anyhoo -
>that's the sum total of the proposal as I heard it from Jos, and seemed
>a happy middle-ground. This is why:
>
>
> >From an implementation point of view, using "values" and sneaky DB
>primitive types is acceptable. This works, because the DB primitive type
>might be viewed as simply another non-canonical lexical form.
>
> >From a roundtripping point of view, this is acceptable, since we are
>basically trying to roundtrip from abstract syntax (or data model) to
>abstract syntax, not preserve "spurious" details about the lexical
>forms. If we don't know anything about the lexical form, then no harm is
>done.
>
>Also, note that no canonicalisation is _required_ from an
>implementation, and that's OK too. If an implementation doesn't know
>about a datatype, there's absolutely nothing it can do better than that
>(the open-world view); if it happens to know about closed-world XSD
>stuff then it may be able to offer more conclusions (Jenny and John have
>the same age).
>
>jan
>
>PS. In many ways, this is just another way of spelling out why Jeremy and
>Patrick differ in opinion, but can see each others' point of view - I
>think.
>
>Some of the reason for this differing of opinion seems to be the subtle
>semantics attached to words like "abstract", "lexical" and "syntax". I
>dunno what to do about that.
>
>
>* that is, for any datatype, there exists a canonicalisation function
>which selects from the set of equivalence sets of lexical forms one
>preferred form.
>
>--
>jan grant, ILRT, University of Bristol. http://www.ilrt.bris.ac.uk/
>Tel +44(0)117 9287088 Fax +44 (0)117 9287112 http://ioctl.org/jan/
>User interface? I hardly know 'er!

-------------------
Graham Klyne
<GK@NineByNine.org>

Received on Saturday, 14 September 2002 03:29:49 UTC