Re: datatyping discussion

Hi Sergey,

sans chapeau

Sergey Melnik wrote:

[...]


> 
> 1. SUGGESTED APPROACHES
> =======================
> 
> All suggested approaches can be roughly divided into two groups,
> "typed instances" and "schema-based typing" (also called weak and
> strong typing [OL]). In the former approach, the typing information is
> attached directly to the data values, whereas in the latter the typing
> information is provided in some (typically external) schema or rule
> set. Examples and references to some concrete suggestions follow
> below.


I too had divided the approaches in two, though along a different line.  An 
alternative dichotomy is based on how the typing information is represented.  
Either the literal is a structured entity, of which the type is a part, or the 
typing information is represented in the triple structure.
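
As a rough illustration of that split, here is a minimal Python sketch.  It is 
not taken from any of the proposals: the TypedLiteral class, the tuple-based 
triples and the two helper functions are all hypothetical names invented for 
this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedLiteral:
    """Hypothetical structured literal: the type is part of the literal."""
    lexical: str
    datatype: str


# Option 1: the literal itself carries its type.
structured = [("John_Smith", "weight", TypedLiteral("10", "xsd:int"))]

# Option 2: the literal stays a plain string; the type is stated
# by extra triples hanging off a blank node.
in_triples = [
    ("John_Smith", "weight", "_:w"),
    ("_:w", "rdf:type", "xsd:int"),
    ("_:w", "rdf:value", "10"),
]


def type_from_literal(triples, subject, predicate):
    """Read the type off the object when it is a structured literal."""
    for s, p, o in triples:
        if (s, p) == (subject, predicate) and isinstance(o, TypedLiteral):
            return o.datatype
    return None


def type_from_triples(triples, subject, predicate):
    """Follow the object node and look for an rdf:type triple on it."""
    for s, p, node in triples:
        if (s, p) == (subject, predicate):
            for s2, p2, o2 in triples:
                if (s2, p2) == (node, "rdf:type"):
                    return o2
    return None
```

In both cases the type is available from the graph itself, without consulting 
an external schema; what differs is whether it lives inside the literal or in 
the surrounding triples.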


Where typing information is represented in a schema, is a schema the only place 
it can be represented?  That is, is it possible to represent typing information 
in a graph which is not a schema?

[...]

> 
> (S3) use bNodes [M&S,TBL]
> 
>     Examples: John_Smith weight [units Pounds, rdf:value "10"], or
>               John_Smith weight [pounds [decimal "10"]]

Hmmm, what do we mean by type here?  Is 'Pounds' a type?  Methinks the type is 
either integer or float.  Pounds are a unit.  Perhaps this should be:

     John_Smith weight [units pounds, rdf:value
                                       [rdf:type xsd:int, rdf:value "10"]]

if I've got my n3 correct.

Is 'decimal' a type?  I'm thinking it is a coding scheme for integers (or floats, 
for that matter).  Why do I feel the gentle tugging of an enormous black hole 
opening up in front of me here?  Time to go read the XSD spec.
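
One way to picture the difference between a type and a coding scheme is to 
separate the lexical form (the string) from the value it denotes.  The Python 
sketch below is only an illustration under that assumption: the PARSERS table 
and the function names are hypothetical, and Python's int, Decimal and str 
merely stand in for the corresponding XSD datatypes.

```python
from decimal import Decimal

# Hypothetical mapping from a datatype name to a function that turns
# a lexical form into a value.
PARSERS = {
    "xsd:int": int,
    "xsd:decimal": Decimal,
    "xsd:string": str,
}


def value_of(lexical, datatype):
    """Interpret a lexical form under the given datatype."""
    return PARSERS[datatype](lexical)


def same_value(a, b, datatype):
    """Compare two lexical forms by the values they denote."""
    return value_of(a, datatype) == value_of(b, datatype)
```

Under this reading, "05" and "5" denote the same value when interpreted as 
xsd:int but differ as xsd:string, which is why comparison needs the type and 
not just the string.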


> 
> 2. CRITERIA
> ===========
> 
> Below is a non-exhaustive list of several criteria that can be used
> for deciding on the suggested approaches. I picked the criteria that
> affect applications critically.
> 
> (C1) backward compatibility wrt existing data and applications


This is an important issue, and whatever answer we choose will be with us for a 
long time, so I think we have to 'get it right'.  If we think that the best 
approach would require us to make changes that are in conflict with our charter, 
I'd rather see us resolve the charter issue and do the right thing.


> 
> (C2) comparing values for custom or unknown datatypes
>      (Is myint:05==myint:5? Given _x1 decimal "5" and _x2 decimal "5",
> is _x1==_x2?)
> 
> (C3) is typing information self-contained or requires external schema
> [DC2]


That doesn't seem like a criterion, unless we specify that one of these is 
preferable to the other.  I have a strong preference for not requiring an 
external schema.


> 
> (C4) are multiple type assignments allowed? (e.g. US dollar, decimal)


As above, I don't see either of these as a 'type', so I'm not sure this 
criterion is well formed.  Nor is it a criterion, unless a preference for one or 
the other is specified.


> 
> (C5) compactness (verbosity of serialization, storage efficiency in
> databases, elegant APIs)


I'd suggest another criterion.  We would have a lot of explaining to do if webont 
were to use a different typing mechanism from us.  An important consideration is 
that we choose an approach that works for that community, currently represented 
by DAML+OIL.  In saying that, I'm not suggesting that we must adopt their current 
solution, but we must strive for a solution acceptable to them.

Brian

Received on Friday, 19 October 2001 09:09:18 UTC