Re: DATATYPING: initial draft

At 03:37 PM 11/20/01 -0800, Sergey Melnik wrote:
>The datatyping document I'm working on is available at
>
>http://www-db.stanford.edu/~melnik/rdf/datatyping/
>
>It is still very rough, but I'm posting the link so you can ensure that
>I'm still on track (I'll be on a Thanksgiving trip Thu-Sun, so I wanted
>to have something out before that).
>
>Looking forward to your comments,

Sergey,

I think this is a great basis for discussion.  My comments on the document:


+ General:  it might be easier to discuss if we had section numbers.


+ Deliverables:  I'm not sure that we should regard a framework for 
*defining* datatypes as a deliverable.  That seems to be beyond our 
charter.  A framework for *using* them seems enough (for now).


+ Desiderata:  I'd like to add the following:
   - Compatibility with existing usage of RDF.


+ Datatype mapping:  I would view XSD "facets" as part of the XSD mechanism 
for defining datatypes.  I think it's enough that we focus on the value 
space, the lexical space and the mapping.


+ Datatype mapping, definition, second item:  do we really need to specify 
that each element of the value space has a lexical representation?  I'm not 
violently opposed, but I'm not sure what purpose it serves, and it seems to 
make the task of defining a value space more difficult.  For example, how 
does one define a datatype whose lexical space is of the form  0.dddddE+ee  
(using decimal digits), plus a couple of oddities, and whose value space is 
some flavour of IEEE floating point numbers?  Defining an exact 
correspondence seems to me like it could be rather hard work.
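
To illustrate the difficulty, here is a rough Python sketch.  It is not 
taken from the draft: the five-digit mantissa, the rounding rule and the 
names to_value and to_lexical are my own assumptions, and the "oddities" 
(NaN, infinities and the like) are ignored.  Going from lexical form to 
value is easy; going back shows that most IEEE doubles have no lexical form 
in this space that maps to them exactly.

```python
import re

# Assumed lexical space: "0.dddddE+ee" or "0.dddddE-ee" (five mantissa digits).
LEXICAL = re.compile(r"0\.\d{5}E[+-]\d{2}")


def to_value(lexical: str) -> float:
    """Map a lexical form to an IEEE double; reject anything outside the lexical space."""
    if not LEXICAL.fullmatch(lexical):
        raise ValueError(f"not in lexical space: {lexical!r}")
    return float(lexical)


def to_lexical(value: float) -> str:
    """Return the nearest lexical form for a non-negative value (assumed rounding rule)."""
    if value < 0:
        raise ValueError("this toy lexical space has no sign")
    if value == 0:
        return "0.00000E+00"
    # "3.0000e-01" -> mantissa "3.0000", exponent "-01"
    mantissa, exponent = f"{value:.4e}".split("e")
    digits = mantissa.replace(".", "")   # five significant digits
    exp = int(exponent) + 1              # shift the decimal point: d.dddd -> 0.ddddd
    if abs(exp) > 99:
        raise ValueError("exponent does not fit in two digits")
    return f"0.{digits}E{exp:+03d}"


x = 0.1 + 0.2                            # an IEEE double close to, but not equal to, 0.3
print(to_lexical(x))                     # nearest lexical form
print(to_value(to_lexical(x)) == x)      # False: the round trip loses information
```

So the mapping from lexical space to value space is straightforward, but 
requiring every value to have a lexical representation either shrinks the 
value space to what five digits can express or forces a much richer lexical 
space.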


+ Facets:  I agree with the thrust of this, and would go further.  I'd not 
even talk about facets being the subject of further RDF work.


+ Representation of datatype mappings and values:   this is clearly a point 
I'd previously misunderstood, so I'll raise the question for discussion 
explicitly.  What should the URI of an XSD datatype actually denote?  The 
value space, the lexical space, the mapping or something else?  And should 
we allow/encourage/require the introduction of "parallel" URIs to name the 
parts not thus named?

(My previous comments have been to the effect that if the XSD URI is 
regarded as denoting the lexical space, existing usage can retain its 
expected interpretation.)


+ xsd:base64Binary, etc, @@@ comment:   doesn't the xsd:base64Binary value, 
as you have defined it, bridge the gap between octets and characters, thus:
   [Some octet sequence] xsd:base64Binary [some character sequence] .
?
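
A minimal Python sketch of what I mean, treating the property as a relation 
between an octet sequence and a character sequence.  The function name 
base64_binary_holds is hypothetical, and the strict decoding is a 
simplification: XSD's lexical space for base64Binary also allows embedded 
whitespace, which this check rejects.

```python
import base64


def base64_binary_holds(octets: bytes, chars: str) -> bool:
    """True if chars is a base64 lexical form whose decoded value is octets."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(chars, validate=True) == octets
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False


# [Some octet sequence] xsd:base64Binary [some character sequence] .
print(base64_binary_holds(b"RDF", "UkRG"))
print(base64_binary_holds(b"RDF", "not base64!"))
```

The subject side is pure octets and the object side is pure characters, so 
the property itself is what carries the encoding.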


+ Unit System:  I tend to think this is out of scope, to the extent that 
the basic mechanisms you present can handle measure-space as a kind of 
value space.  For applications that wish to make the kind of measure 
distinction that you describe, the measure space and number space can 
be treated like ordinary RDF.  In your diagram, weight, age, inKg, inYears, 
inMonths are just ordinary RDF properties.  The remaining properties are 
what this note is about.

I think this part is fine as an example, but that we shouldn't even start 
to get caught up in defining unit system vocabularies.  I think each such 
vocabulary should have a working group in its own right.


+ FAQs:  Here are some suggested answers.

- We don't explicitly distinguish a datatype property from a non-datatype 
property.  Why bother?  (Isn't this a strength of this approach?)

- How do we refer to value spaces/lexical spaces?  Additional vocabulary 
defined as required.  (cf. my suggestions of "parallel" namespaces for XSD.)

- Attach two datatype properties to a node?  Yes.  Why not?  (Especially if 
they're just like other properties.)  I think the existence or otherwise of 
a model can handle any issues of consistency that might arise.

- Can write (X age "12")?  Of course!  Whether or not it is useful to write 
this depends on the definition (interpretation) of "age".

- Refer to datatype values without using datatype properties?  I see no 
reason why not.  They are just denoted values, right?  Whether one can know 
that the value "belongs" to a datatype may be trickier.  I am thinking of 
saying things like "the mass of X is the same as the mass of Y".  I think 
the issue of defining a URI scheme with semantics is out of scope for us.


+ TODO, how datatypes are defined (described?) in MT:  are they not just 
properties with a relational extension defined according to some datatype 
value space/lexical space/mapping?
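
One way to picture this reading, as a minimal Python sketch rather than 
anything the model theory actually says: the relational extension of a 
datatype property is the set of (value, lexical form) pairs picked out by 
the mapping.  Datatype, holds and decimal_integer are hypothetical names, 
and the toy decimal-integer type merely stands in for a real XSD datatype.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Datatype:
    """A datatype given by a lexical-space test and a lexical-to-value mapping."""
    in_lexical_space: Callable[[str], bool]
    mapping: Callable[[str], object]

    def holds(self, value: object, lexical: str) -> bool:
        """True if (value, lexical) is in the property's relational extension."""
        return self.in_lexical_space(lexical) and self.mapping(lexical) == value


# Toy datatype (an assumption for illustration): strings of ASCII decimal digits.
decimal_integer = Datatype(
    in_lexical_space=lambda s: s.isascii() and s.isdigit(),
    mapping=int,
)

print(decimal_integer.holds(12, "12"))      # pair is in the extension
print(decimal_integer.holds(12, "012"))     # several lexical forms, one value
print(decimal_integer.holds(12, "twelve"))  # outside the lexical space
```

Nothing here distinguishes the property from any other RDF property, apart 
from the way its extension happens to be defined.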


+ TODO, how to define datatypes in RDF:  I think this is out of scope.


#g



------------------------------------------------------------
Graham Klyne                    MIMEsweeper Group
Strategic Research              <http://www.mimesweeper.com>
<Graham.Klyne@MIMEsweeper.com>

Received on Thursday, 22 November 2001 07:10:44 UTC