Re: Comments on new datatyping document, part 1

[Patrick Stickler, Nokia/Finland, (+358 50) 483 9453, patrick.stickler@nokia.com]


----- Original Message ----- 
From: "ext Graham Klyne" <GK@NineByNine.org>
To: "RDF core WG" <w3c-rdfcore-wg@w3.org>
Sent: 09 September, 2002 18:10
Subject: Comments on new datatyping document, part 1


> 
> These comments are with reference to 
> http://www-nrc.nokia.com/sw/rdf-datatyping.html, as modified by 
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Sep/0040.html, point 
> 1.  (I can't tell for sure if the other changes proposed there will affect 
> my comments below, so I'll make them anyway.)
> 
> I've focused my review on sections 2.1, 2.2, 2.3, 3.1, 3.2 and 4, since 
> they seem to be the substantive core that we've agreed to adopt.
> 
> My most serious comment concerns section 3.1 (and, by association, 3.2).
> 
> 
> 2.1 rdfs:Datatype - OK
> 
> 
> 2.2 Datatype Mapping - OK, I think.
> 
> Is it necessary to indicate that the XML flag and language tag are a 
> significant part of the literal value mapping?  

No. It is necessary to make clear that they are *not* a significant
part of the literal to value mapping.

> For example consider whether:
>    < <xsd:integer>"25"   , 25 >
>    < <xsd:integer>"25"-en, 25 >
> are distinct members of a datatype mapping.  Similarly, are the following 
> distinct?
>    < <xsd:integer>"25"   , 25 >
>    < <xsd:integer>xml"25", 25 >

The XML flag and xml:lang code do not participate in any way
in datatyping semantics. They are invisible/ignored/discarded/whatever
when considering the L2V mapping. Only the Unicode string portion is
relevant, and it is taken, alone, to represent a lexical form, a
member of the lexical space of the datatype.

Also, Part 1 does not define any participation of XML literals in
datatyping, only non-XML literals.

Thus, all of the following typed literal nodes denote the very
same value (ten):

   <http://...#integer>"10"
   <http://...#integer>"10"-en
   <http://...#integer>"10"-fi
   <http://...#integer>"10"-sp
   <http://...#integer>"10"-en_UK

etc.

And the following are disallowed

   <http://...#datatype>xml"LLL"
   <http://...#datatype>xml"LLL"-xx
   <http://...#datatype>xml"LLL"-xx_XX


XML literals are not datatyped (at least as far as Part 1 is
concerned). As an aside, I think they *could* be datatyped,
with complex datatypes, but that remains in Part 2 and is not
part of the recent consensus.
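
To illustrate the two lists above, here is a minimal sketch, not taken
from the datatyping document. The TypedLiteral class, the l2v function
and the "xsd:integer" label are hypothetical names chosen only for this
example, and only an integer datatype is modelled.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical label for the integer datatype; the real URI is elided above.
INTEGER = "xsd:integer"
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


@dataclass(frozen=True)
class TypedLiteral:
    datatype: str
    lexical_form: str
    lang: Optional[str] = None   # language tag, ignored by the mapping
    xml: bool = False            # XML flag, not allowed on typed literals


def l2v(lit: TypedLiteral) -> int:
    """Map a typed literal to its value using only the lexical form."""
    if lit.xml:
        # Assumption: Part 1 treats an XML literal with a datatype as an error.
        raise ValueError("XML literals are not typed literals in Part 1")
    if lit.datatype != INTEGER:
        raise ValueError("no L2V mapping known for " + lit.datatype)
    if INTEGER_LEXICAL.fullmatch(lit.lexical_form) is None:
        raise ValueError("not in the lexical space: " + lit.lexical_form)
    # lit.lang is deliberately never consulted.
    return int(lit.lexical_form)
```

With this sketch, l2v gives the same value for "10", "10"-en and
"10"-fi, and raises an error whenever the XML flag is set.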

> 2.3 Typed Literal
> 
> I think the discussion of "validity" belongs here with the definition of a 
> typed literal, rather than in the section 3 on designation in RDF.  Rather 
> than defining the concept of "validity" it may be simpler to simply say 
> that the lexical form part of a typed literal MUST be in the lexical space 
> of the identified datatype.

Well, it all comes down to how one determines if it *is* in fact
in the lexical space. Also, there may be other mechanisms which could
assert what the value of the property is, and that the property takes
a unique value; in that case, the whole L2V mapping comes into play.

And of course, testing whether a lexical form is in the lexical space
of a datatype and/or whether the value to which that lexical form maps
is equal to all other values asserted for that property, in the case
of a unique value constraint, must happen outside the scope of RDF
proper.
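
As a rough sketch of the kind of test that would live outside RDF
proper, assuming an integer datatype whose lexical space is an optional
sign followed by digits (in_lexical_space and unique_value_ok are
hypothetical helper names, not anything defined by the document):

```python
import re

# Assumed lexical space for the integer datatype: optional sign, then digits.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def in_lexical_space(form: str) -> bool:
    """True if the string is a member of the assumed integer lexical space."""
    return INTEGER_LEXICAL.fullmatch(form) is not None


def unique_value_ok(forms) -> bool:
    """Check a unique value constraint over several lexical forms.

    The forms pass only if each is in the lexical space and all of them
    map to one and the same value, so the comparison is made on values,
    not on strings.
    """
    values = set()
    for form in forms:
        if not in_lexical_space(form):
            return False
        values.add(int(form))
    return len(values) <= 1
```

Here "10", "010" and "+10" satisfy the unique value constraint, since
they are distinct lexical forms for the same value, while "10" and "11"
do not.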

> 
> 3. Designation of Typed Literals in RDF
> 
> I think this is just about how to represent a typed literal, so I don't 
> think the discussion of validity should be part of this (see above).
> 
> I think the N-triples syntax examples should include a case corresponding 
> to the XML flag being set.

As an error case, sure, since an XML literal should not be a typed
literal.

> 
> 3.1 Global Datatyping Assertions
> 
> I have potential serious concerns with this section.  I think it could be 
> dropped without harm to the basic structure of local typing.
> 
> Also part of my concern is that I understand there to be a desire to deal with 
> the issue of literal datatyping separately from the issue of tidiness of 
> untyped literals.  I think this section defeats such separation.

This section has nothing whatsoever to do with tidiness/untidiness or
untyped literals. It may be unclear, or maybe you should re-read it.

It only concerns itself with how existing rdfs:range semantics applies
to datatype values, just as it does to any property values asserted to
be members of a given RDF class.

Global datatyping assertions do not function without explicitly typed
literals.

> The basis of my concern, reinforced by the 1st paragraph of this section, 
> is that it *could* lead to non-monotonicity depending upon a decision about 
> tidyness of untyped literals.  The 1st para says:
> [[
> It is often convenient to associate a datatype with a property, so that 
> every use of the property can be understood as asserting a particular 
> datatype for every value.
> ]]
> which I think is re-introducing the very aspect of datatyping that was 
> previously causing us such grief.  

Well, it may need some wordsmithing, yes, to be sure it's not
introducing global *implicit* datatyping, as it's not.

And also, there is no way we can constrain users from making
rdfs:range assertions for datatype classes, given that we provide
a mechanism (locally typed literals) to represent members of
the value space (class extension) of datatype classes, and many
applications utilize such range assertions to test for contradictions,
as a means of constraining property values accordingly.

I.e., global datatyping assertions are provided for by rdfs:range,
and we can't take that away, we can only try to make clear how
it relates to datatype classes in particular.
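
A minimal sketch of such a range check, under two loudly stated
assumptions: untyped literals are simply skipped (the range assertion
says nothing about them here), and a clash is approximated as a mismatch
of datatype names, whereas the real test would be membership in the
value space. The Literal type, range_clashes and all the ex: names are
hypothetical.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    lexical_form: str
    datatype: Optional[str] = None   # None means an untyped (inline) literal


def range_clashes(statements, ranges):
    """Return the statements whose typed literal disagrees with rdfs:range.

    statements: iterable of (subject, property, Literal)
    ranges: dict mapping a property to the datatype asserted as its range
    """
    clashes = []
    for subject, prop, obj in statements:
        expected = ranges.get(prop)
        if expected is None or obj.datatype is None:
            # No range assertion, or no explicit type: nothing to test.
            continue
        if obj.datatype != expected:   # simplification: compare names only
            clashes.append((subject, prop, obj))
    return clashes


ranges = {"ex:age": "xsd:integer"}
statements = [
    ("ex:bob", "ex:age", Literal("25", "xsd:integer")),  # agrees with range
    ("ex:ann", "ex:age", Literal("25")),                 # untyped, skipped
    ("ex:cat", "ex:age", Literal("25", "xsd:string")),   # reported as clash
]
found = range_clashes(statements, ranges)
```

Only the ex:cat statement ends up in found; the untyped literal on
ex:ann is untouched by the range assertion.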

> If we suppose that untyped literals are 
> tidy, so we can have entailments like:
> 
> The second paragraph of this section reads:
> [[
> RDF Datatyping employs rdfs:range to associate a datatype class with a 
> particular property. The associated datatype may be taken to constrain 
> all values of the property to correspond to members of the value space of 
> the designated datatype, and according to the characteristics of RDF 
> datatypes thereby also constrain all lexical forms to members of the 
> lexical space of the datatype.
> ]]
> The problem, I think, is with the clause beginning "and according to ...".

Yes. It needs rewriting. It's not meant to say what you seem to be
reading into it. It has nothing whatsoever to do with inline literals,
only explicitly typed literals, and of course, that should be made clear.

> ...
> 
> 3.2  Datatype Clashes
> 
> I think this section is closely related to 3.1, and cannot comment further 
> until 3.1 is adequately clarified.  If section 3.1 is dropped, then I think 
> this section too should be dropped.
> 
> 
> 4. RDF Datatyping Model Theory
> 
> I think there's a problem with statement (1):
> 
>    ICEXT(I(ddd)) = {x : <x,y> in IEXT(I(ddd))}
> 
> This condition is expressed in terms of IEXT(I(ddd)), but I don't see the 
> earlier sections describing a datatype URI as denoting a datatype mapping 
> relation.  I think the intent here can be obtained by saying:
> 
>    ICEXT(I(ddd)) = { x : EXISTS(y) and x = L2V(I(ddd))(y) }
> 
> ...

The MT section has been awaiting review by Pat and Sergey for
quite some time... I make no assertions that it is in any way
sane or rational.

Patrick

Received on Tuesday, 10 September 2002 05:22:06 UTC