[TED] rif:text datatype

I am writing this e-mail in partial fulfillment of action 369 and to
report on the status of the rif:text specification problem.

Background of the problem:
For the purpose of representing text in different natural languages and
identifying these natural languages, RDF has a specific kind of symbol
called "plain literal with language tag", which is a combination of a
string and a language tag (e.g. "wine"@en, "vin"@fr).  For the purposes
of compatibility with RDF and being able to write rules about RDF
graphs, it was decided (*) in RIF to allow users to write such strings
with language tags.
However, it was also decided in RIF that, for the purpose of uniformity,
(**) we only have one kind of symbol which is a combination of a string
and an IRI identifying a symbol space (a datatype is a special type of
symbol space).  So, we cannot use plain literals with language tags
directly in RIF.
In order to allow users to write strings with language tags but still
satisfy the requirement that there is only one kind of symbol in RIF, it
was proposed to create a new datatype (rif:text) with an appropriate
definition of a lexical space, a value space, and a lexical to value
mapping.  The datatype is defined in the current draft of the BLD
specification [1].  There was a subsequent understanding that it would
be a good idea to coordinate the definition of the datatype with the XML
Schema working group, since this working group has defined many
datatypes which we are using.
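
To make the three parts of such a datatype concrete, here is a minimal
Python sketch of a lexical-to-value mapping. It is an illustration only,
not the definition in the BLD draft [1]: it assumes that a lexical form
is written "string@lang" and that the language tag follows the last "@".
The names TextValue and parse_rif_text are hypothetical.

```python
from typing import NamedTuple


class TextValue(NamedTuple):
    """A value in the assumed value space: a (string, language tag) pair."""
    string: str
    lang: str


def parse_rif_text(lexical: str) -> TextValue:
    """Map a lexical form such as 'wine@en' to a (string, language tag) pair.

    Assumption: the language tag is everything after the LAST '@', so the
    string part may itself contain '@' characters.
    """
    string, sep, lang = lexical.rpartition("@")
    if not sep or not lang:
        # No '@' at all, or nothing after it: not in the assumed lexical space.
        raise ValueError(f"not a valid lexical form: {lexical!r}")
    return TextValue(string, lang)


wine = parse_rif_text("wine@en")
vin = parse_rif_text("vin@fr")
```

The point of the sketch is that the value is a pair, even though the
lexical form is a single string; this is exactly the property discussed
in the rest of this message.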

I sent an e-mail [2] to this working group asking whether they have
considered creating such a (built-in) datatype, or whether they would be
interested in coordinating the efforts on the specification of this
datatype.

Response from the XML Schema working group:
As I expected, the reply I received [3] said that the XML Schema working
group would probably not be interested in specifying such a datatype,
because it is not "simple" enough; values are pairs of strings and
language tags, rather than atomic values. In the XML world, one would
use a complex type for this, using elements or attributes for specifying
these two separate things (string and language tag).
Unfortunately, this doesn't really help us, because of the decision (**)
to have only one kind of symbol (combination of string and symbol space
IRI). [4]

Next steps:
As I see it, we basically have two options:
a) not change anything: we already have a specification of the rif:text
datatype which we can use for the representation of strings with
language tags.  The XML Schema working group does not appear to be
interested in coordinating the specification of this datatype, so we
will just do it on our own (in fact, we have already).  One might argue,
though, that it is not our business to specify datatypes.

b) drop the constraint (**) that we only have one type of constant
symbol, and specify a dedicated kind of symbol for writing down strings
with language tags.  The drawback of this option is that the language
definition might seem somewhat more complex.  Furthermore, if we go down
this path, I think we should define special kinds of symbols for other
classes of symbols as well (specifically, IRIs, local symbols); there
would no longer be a real justification for using symbol spaces for the
representation of these symbols (since we would no longer have the
constraint (**)).
Following option b) would probably alleviate the concern raised in issue
42 [5].
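
The difference between the two options can be sketched as two data
models. This is a rough Python illustration under my own assumptions,
not proposed syntax: all class names are hypothetical, and "rif:text"
stands in for the full datatype IRI.

```python
from dataclasses import dataclass
from typing import Union

RIF_TEXT = "rif:text"  # placeholder for the full datatype IRI


# Option a): one uniform kind of symbol, a string plus a symbol space IRI.
@dataclass(frozen=True)
class Constant:
    lexical: str
    symbol_space: str


# Option b): a distinct kind of symbol for each class of constants.
@dataclass(frozen=True)
class LangString:
    string: str
    lang: str


@dataclass(frozen=True)
class IriConstant:
    iri: str


@dataclass(frozen=True)
class LocalConstant:
    name: str


@dataclass(frozen=True)
class TypedLiteral:
    lexical: str
    datatype: str


Symbol = Union[LangString, IriConstant, LocalConstant, TypedLiteral]

# The same constant under each option.
wine_a = Constant("wine@en", RIF_TEXT)  # language tag packed into the string
wine_b = LangString("wine", "en")       # language tag kept as its own field
```

Under option a) the language tag has to be encoded in the lexical
string; under option b) it is a separate component, at the cost of more
kinds of symbols in the language definition.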

I myself (still) have a preference for option b), but if this option is
not acceptable to the working group, then I will not oppose option a).


Best, Jos


[1] BLD specification
[2]
http://lists.w3.org/Archives/Public/www-xml-schema-comments/2007JulSep/0102.html
[3]
http://lists.w3.org/Archives/Public/www-xml-schema-comments/2007OctDec/0040.html
[4] Of course, the XML representation of RIF could use two separate
elements/attributes for the representation of rif:text values.
[5] http://www.w3.org/2005/rules/wg/track/issues/42

-- 
                         debruijn@inf.unibz.it

Jos de Bruijn,        http://www.debruijn.net/
----------------------------------------------
In heaven all the interesting people are
missing.
  - Friedrich Nietzsche

Received on Wednesday, 7 November 2007 16:09:58 UTC