Re: an idea -- using different literal classes for each datatype from Dave Reynolds on 2007-08-29 (public-rif-wg@w3.org from August 2007)

From: Dave Reynolds <der@hplb.hpl.hp.com>
Date: Wed, 29 Aug 2007 09:03:11 +0100
To: Sandro Hawke <sandro@w3.org>
Cc: RIF WG <public-rif-wg@w3.org>
Message-ID: <46D5283F.1010408@hplb.hpl.hp.com>
Sandro Hawke wrote:
> Dave Reynolds <der@hplb.hpl.hp.com> writes:
>> Sandro Hawke wrote:
>>> I just had another idea ...
> ...
>>>              subclass Literal_int
>>>                  property intValue: xs:int
> ...
>>> I like how this makes the ASN a more complete description of the
>>> dialect.  It's very XML-happy, I think -- a schema processor will get
>>> the right types for everything.
>>>
>>> Not using rdf:datatype makes things a bit harder for RDF folks,
>> This seems to mean that you have to extend the RIF abstract (and thus 
>> concrete) syntax to add in a new datatype. Whereas in RDF and OWL no 
>> language change was needed to support a greater range of XSD types than 
>> the minimum. This strikes me as a desirable and successful feature which 
>> it would be preferable if RIF could emulate.
> 
> The issue here is how many extensibility points there are.  I am kind of
> liking the idea of having exactly one: every dialect has a different
> syntax, and the inputs/outputs you process can be completely summed up
> by naming which dialects you process.  If a system implements BLD and
> also the datatype xs:double, then we just say it handles a new dialect
> (which, somewhere, is defined to be BLD along with an xs:double
> subclass, as above.)

I prefer the notion that things like primitive datatypes are modular and 
you can support an additional datatype without having to define a new 
dialect.

In particular, in a system where the datatype is simply a URI label, 
once you have the URI you are done - no syntax extension, no namespace 
issues. This means that when you are lifting datatypes from an existing 
spec such as XSD, where the URI is clear, everyone will choose 
the same one with no need for further coordination.
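
To make the "URI label" point concrete, here is a minimal sketch in Python. 
It is an illustration only, not anything from the RIF drafts: the names 
DATATYPES and value_of are hypothetical, and Python's int and float only 
approximate the XSD lexical spaces.

```python
# Hypothetical registry: datatype URI -> lexical-to-value mapping.
# Nothing here is RIF-defined; it only illustrates the modularity argument.
XSD = "http://www.w3.org/2001/XMLSchema#"

DATATYPES = {
    XSD + "integer": int,  # approximation of the xs:integer lexical space
    XSD + "string": str,
}


def value_of(lexical_form, datatype_uri):
    """Return the value denoted by a typed literal (lexical form + datatype URI)."""
    try:
        to_value = DATATYPES[datatype_uri]
    except KeyError:
        raise LookupError("unsupported datatype: " + datatype_uri)
    return to_value(lexical_form)


# Supporting one more datatype is a single registry entry keyed by the URI
# that XSD already defines. The shape of the literal itself does not change.
DATATYPES[XSD + "double"] = float
```

A processor that does not know a given URI can still read the literal as a 
(lexical form, URI) pair and only fails, or falls back, when it needs the value.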

Contrast this with the situation where you have to extend the RIF syntax 
to add a new datatype. An implementer should not choose the RIF 
namespace for their new syntax element, and can't use any existing spec, 
which means they have to make one up in a namespace under their control. 
This means that every group is essentially forced to create a different 
element unless there is coordination.

> Extensibility is about being able to feed a RIF document in dialect D1
> to a processor that can only handle dialect D2.  We want non-conflicting
> syntaxes, so if the document is in the intersection of D1 and D2, it can
> simply be read as being in D2, and we also need some way to do a managed
> fallback if the document uses bits of D1 that are not in D2. 

True, but extensibility is also about encouraging convergence in the 
extensions, and part of that is making it simple to reuse things 
like datatypes (and associated functions and operators) defined by other 
groups. Indeed we have an explicit RIF requirement to not have too many 
dialects. A generic syntax with datatypes as a modular extension point 
seems better able to meet that need.

> Whatever
> this fallback mechanism is, I like the idea of being able to apply it to
> datatypes (and builtins, and any other areas where implementation
> profiles might differ).
> 
> It's hard to actually make this kind of decision without experience with
> that fallback mechanism.
> 
>>> but I'm
>>> not sure the semantics of rdf:datatype were right anyway. 
>> ??
>>
>>> Honestly, I
>>> have no idea if, in RIF, "p(1)" is supposed to be conveyed the same as
>>> "p(0x01)" -- that is, is the argument supposed to be the integer one, or
>>> is it supposed to be some string with an associated datatype.  
>> The argument is a typed literal which comprises a lexical form (the 
>> string) and a datatype URI, which together denote a value determined by 
>> the lex->value mapping associated with the datatype.
> 
> I think I agree that's what the argument should be in a RIF translation
> of "p(1)".  The question is how you convey that thing (the typed literal
> which comprises a lexical form and a datatype URI) in XML.
> 
> The straightforward but verbose answer is:
> 
>      <TypedLiteral>      <!-- style 1 -->
>          <datatype>&xsd;int</datatype>
>          <lexicalRepresentation>1</lexicalRepresentation>
>      </TypedLiteral>
> 
> I don't really know how to do it using rdf:datatype.   In one version I
> did it like this:
> 
>      <rdf:Description>
>          <value rdf:datatype="&xsd;integer">1</value>
>      </rdf:Description>

Surely that would be:

     <TypedLiteral>
        <value rdf:datatype="&xsd;integer">1</value>
     </TypedLiteral>

> but even aside from the weirdness of "value", I think the semantics are
> subtly wrong.   Eh...   Maybe it's too subtle for me.   Somehow the
> level of quoting, and access to the syntax seems wrong.

I don't think the semantics are wrong, though I agree that using "value" 
is not that compelling.

> It seems to me that 
>      <intValue>1</intValue>
> 
> (perhaps with a redundant Literal_int wrapper, a subclass of
> TypedLiteral) has essentially the same semantics as you state, but is
> much more xml-friendly than style-1 example above.

It's not XML-friendly in that the set of allowed types is then part of 
the XML Schema and so can't be modularly extended. You can't add a 
datatype without violating the schema. That might be a deliberate goal 
but I think it is brittle and to be avoided.

A syntax like:

     <TypedLiteral xsi:type="&xsd;integer">1</TypedLiteral>

would allow that modularity and would surely also be XML-friendly 
(though not directly RDF/XML compatible if that is an issue).
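
As a rough sketch of why this form stays modular, the Python below reads such 
an element with the standard xml.etree.ElementTree module. The helper name 
read_typed_literal is hypothetical, the entity &xsd; is written out as a full 
URI so the snippet parses without a DTD, and the attribute value is treated 
as a plain URI string, as in the example above.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
XSD = "http://www.w3.org/2001/XMLSchema#"


def read_typed_literal(xml_text):
    """Return (lexical_form, datatype_uri) for a <TypedLiteral> element."""
    elem = ET.fromstring(xml_text)
    if elem.tag != "TypedLiteral":
        raise ValueError("expected a TypedLiteral element, got " + elem.tag)
    # ElementTree exposes namespaced attributes as "{namespace-uri}localname".
    datatype_uri = elem.get("{%s}type" % XSI)
    return (elem.text or "", datatype_uri)


# The same element shape carries any datatype; only the attribute value differs.
int_doc = '<TypedLiteral xmlns:xsi="%s" xsi:type="%sinteger">1</TypedLiteral>' % (XSI, XSD)
dbl_doc = '<TypedLiteral xmlns:xsi="%s" xsi:type="%sdouble">1.5</TypedLiteral>' % (XSI, XSD)
```

The reader never needs a per-datatype element name, so a datatype it has not 
seen before still parses into a pair it can pass on or reject.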

Dave
-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Wednesday, 29 August 2007 08:03:16 UTC