In defense of S

Recently, there've been a wave of criticism of the S proposal on the
list. I'd like to address a central concern here.

It seems that S is disliked by some because it seems to suggest a
certain "idiom", or "data coding style", which deviates from many
existing use cases.

I've pointed out many times that this idiom is not the only way of
using S, but this issue keeps being catcalled over and over again,
sometimes using very unconstructive arguments.

To remind, the essence of the S proposal is to denote datatype
mappings using RDF properties. I believe this is easy enough to
understand. Doing so enables two things:

  1) referring to value space and lexical space of
     a datatype mapping explicitly
  2) referring to individual elements of value spaces explicitly

This is pretty much it. The document I've been working on illustrates
one possible idiom of how elements of *value spaces* can be used and
referred to by applications. Let me call it "Idiom A":

Idiom A
=======

Range of a property like dc:date is defined as the *value space*
of a datatype:

  dc:date --rdfs:range--> [] <--rdfs:domain-- xsd:duration

Therefore, to refer to individual elements of the value space
we write:

  X --dc:date-> [] --xsd:duration--> "2001-12-04"

Idiom B
=======

Another way of using datatype mappings is to define the range of a
property as the *lexical space* of a datatype:

  dc:date --rdfs:range--> [] <--rdfs:range-- xsd:duration

Therefore, we write:

  X --dc:date-> "2001-12-04"

In this case, the property values of dc:date are lexical tokens that
belong to the lexical space of xsd:duration. However, since there is
exactly one duration value that corresponds to each lexical token,
applications may exploit this fact to determine what durations are
meant.

Epilogue
========

Proposal S *does not* state that A, B or some other idiom is the best,
normative way of modeling typed values (although the document I've
been working on focuses on A so far).

Most of the time, whether we use A or B is plain irrelevant given a
scope of a specific application, as long as the one or the other idiom
is used consistently. Complications arise when we start integrating
different applications, or when more sophisticated datatype modeling
is required (like union of types in XSD). My belief is that for such
involved scenarios the Idiom A is superior.

Received on Tuesday, 4 December 2001 14:39:28 UTC