Re: datatyping discussion

Brian McBride wrote:
> 
> Hi Sergey,
> 
> sans chapeau
> 
> Sergey Melnik wrote:
> 
> [...]
> 
> >
> > 1. SUGGESTED APPROACHES
> > =======================
> >
> > All suggested approaches can be roughly divided into two groups,
> > "typed instances" and "schema-based typing" (also called weak and
> > strong typing [OL]). In the former approach, the typing information is
> > attached directly to the data values, whereas in the latter the typing
> > information is provided in some (typically external) schema or rule
> > set. Examples and references to some concrete suggestions follow
> > below.
> 
> I too had divided the approaches in two, though I had a different split.  An
> alternative dichotomy is around how the typing information is represented.  We
> can either have a literal as a structured entity, of which the type is a part,
> or the typing info can be represented in the triple structure.

Absolutely. This dimension is orthogonal to the "typed"-vs.-"schema"
dichotomy, but is cleaner, and I like it more overall. I even think this
is something we can vote on as early as the coming telecon!
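
To make the two options concrete, here is a rough sketch in Python. It
is only an illustration: the TypedLiteral class, the helper functions,
and the ex: names are made up for this message and are not taken from
any of the proposals. Option (a) keeps the type inside a structured
literal; option (b) keeps literals as plain strings and puts the type
into the triple structure via a bNode, loosely following the (S3)
example quoted further down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedLiteral:
    """Hypothetical structured literal: the type is part of the literal."""
    lexical: str
    datatype: str


# (a) Literal as a structured entity that carries its own type.
triples_a = [
    ("ex:John_Smith", "ex:weight", TypedLiteral("10", "ex:decimal")),
]

# (b) Plain string literal; the type lives in the triple structure,
#     on an arc leaving an intermediate bNode.
triples_b = [
    ("ex:John_Smith", "ex:weight", "_:b1"),
    ("_:b1", "ex:decimal", "10"),
]


def typed_values_a(triples, subject, prop):
    """Read (lexical form, type) pairs from structured literals."""
    return [
        (o.lexical, o.datatype)
        for s, p, o in triples
        if s == subject and p == prop and isinstance(o, TypedLiteral)
    ]


def typed_values_b(triples, subject, prop):
    """Read (lexical form, type) pairs by following the bNode."""
    bnodes = {o for s, p, o in triples if s == subject and p == prop}
    return [(o, p) for s, p, o in triples if s in bnodes]


# Both encodings carry the same information.
print(typed_values_a(triples_a, "ex:John_Smith", "ex:weight"))
print(typed_values_b(triples_b, "ex:John_Smith", "ex:weight"))
```

Either way the same (lexical form, type) pair can be recovered; the
difference is only where a consumer has to look for it.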
 
> In the case where typing information is represented in schema, can typing
> information only be represented in a schema?

I believe so. In (S4) I gave an example where typing information is
given using a rule which may reside in an external schema (John_Smith's
weight as a pieces-of-eight number). In other words, the instance data
may be lacking the typing info completely.

>  i.e. is it possible to represent
> typing info in a graph which is not a schema?

Are you asking whether typing info can appear in instances only so that
no schema is needed at all?

> [...]
> 
>  >
>  >(S3) use bNodes [M&S,TBL]
>  >
>  >    Examples: John_Smith weight [units Pounds, rdf:value "10"], or
>  >              John_Smith weight [pounds [decimal "10"]]
> 
> Hmmm, what do we mean by type here?  Are 'Pounds' a type?  Methinks the type is
> either integer or float.  Pounds are a unit.

I reckon we have to consider both numeric typing and unit-like typing. I
agree that there might be a critical distinction between them, but we
must be careful not to shut the door on either of them.
 
> Is 'decimal' a type?  I'm thinking it is a coding scheme for integers (or floats
> for that matter).  Why do I feel the gentle tugging of an enormous black hole
> opening up in front of me here?  Time to go read the XSD spec.

Well, the XSD spec distinguishes between "value spaces" and "lexical
spaces", and "decimal" has both. In fact, I'm thinking hard about
whether it makes sense to make this distinction explicit, so that value
spaces and lexical spaces may be defined independently. For example, it
may make sense to speak of a value space of, say, Java integers
[-2147483648..2147483647] vs. a lexical space of integers that can be
represented using a hexadecimal string of a certain length. As another
example, consider two disjoint lexical spaces for decimals: one in which
each lexical token has exactly one digit before the dot (e.g. "1.23E1")
vs. the other one in which use of "E" is prohibited (e.g. "12.3"). No
token belongs to both spaces, yet either of them can be used to encode
decimals.
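
A small Python sketch of what defining the two kinds of spaces
independently could look like. Everything here is an assumption made
for illustration: the function names, the choice of 8 hex digits read
as two's complement, and the reading of the first decimal space as one
where the exponent is mandatory (which is what keeps it disjoint from
the second).

```python
import re
from decimal import Decimal

# Value space: the range of a Java int.
JAVA_INT_MIN, JAVA_INT_MAX = -2**31, 2**31 - 1


def in_java_int_value_space(n: int) -> bool:
    return JAVA_INT_MIN <= n <= JAVA_INT_MAX


# Lexical space: hexadecimal strings of exactly 8 digits (assumed
# length), mapped onto the value space as two's complement.
HEX8 = re.compile(r"[0-9A-Fa-f]{8}")


def hex8_to_value(token: str) -> int:
    if not HEX8.fullmatch(token):
        raise ValueError(f"not in the hex8 lexical space: {token!r}")
    n = int(token, 16)
    return n - 2**32 if n >= 2**31 else n


# Two lexical spaces for decimals. SCIENTIFIC requires an "E" and
# exactly one digit before the dot; PLAIN prohibits "E".
SCIENTIFIC = re.compile(r"-?\d\.\d+E-?\d+")
PLAIN = re.compile(r"-?\d+\.\d+")


def scientific_to_value(token: str) -> Decimal:
    if not SCIENTIFIC.fullmatch(token):
        raise ValueError(f"not in the scientific lexical space: {token!r}")
    return Decimal(token)


def plain_to_value(token: str) -> Decimal:
    if not PLAIN.fullmatch(token):
        raise ValueError(f"not in the plain lexical space: {token!r}")
    return Decimal(token)


# Different tokens from disjoint lexical spaces, same value.
print(scientific_to_value("1.23E1") == plain_to_value("12.3"))
print(in_java_int_value_space(hex8_to_value("7FFFFFFF")))
```

The point of the sketch is only that the mapping functions and the
value-space check are separate pieces, so a new lexical space can be
added without touching the definition of the value space.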

Sergey

Received on Monday, 22 October 2001 21:17:26 UTC