Re: ACTION-432: ugly ill-formed literals use case...

Axel Polleres wrote:
> 
> Michael Kifer wrote:
>>> Ok, here is another use case where one might get ill-formed literals.
>>>
>>> I have some RDF data.
>>>
>>> :a :age "old".
>>> :b :age "33".
>>> :c :age "young".
>>> :d :age "88".
>>>
>>> and want to write a rule which converts the untyped
>>> literals to xsd:integer typed ones where possible.
>>>
>>> Ideally, I would like to have the option to write something in
>>> RIF like:
>>>
>>>    ?X[:age->&ex:createTypedLiteral(?Y,"xsd:integer"^^xsd:anyURI)] :-
>>>        ?X[:age->?Y] and &isInteger(?Y).
>>>
>>> where ex:createTypedLiteral is a built-in function creating
>>> a typed literal from a string and a datatype IRI and
>>> isInteger is a type checking builtin.
>>>
>>> Now if I drop the last condition
>>>
>>>    ?X[:age->&ex:createTypedLiteral(?Y,"xsd:integer"^^xsd:anyURI)] :-
>>>        ?X[:age->?Y].
>>>
>>> I will get ill-formed literals. :-(
>>>
>>>   You might argue that functions to "construct"
>>> new literals are not so nice, but I think in practical
>>> RDF transformation use cases they are important.
>>>   Well, one could of course also argue that ex:createTypedLiteral
>>> should be defined so that it returns an error on an ill-formed
>>> result... but I don't know whether we can enforce that if we
>>> allow built-ins to be extensible.
>>
>> A constructor should return valid literals. So, it should give an 
>> error here.
> 
> basically that is what I discussed with Jos.
> Maybe we should have in the datatypes and built-ins document that:
> 
> - new datatypes have to define their lexical and value space.
> - all built-in functions are expected to return only well-formed literals
> - any built-in predicate should return an error when any of its
> arguments is bound to an ill-formed literal
> - the predefined interpretation of any built-in predicate should not 
> include any tuple involving ill-formed literals
> 
> ok? And is that something we have to put in the BLD or rather the
> DTB document?

That all seems reasonable to me.
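
As a rough illustration of the first two points, here is a minimal Python sketch. The names create_typed_literal, is_integer and IllFormedLiteral are hypothetical stand-ins for the ex:createTypedLiteral and isInteger built-ins in the example; none of them comes from a RIF document, and the sketch assumes xsd:integer is the only datatype the processor knows.

```python
import re

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Lexical space of xsd:integer: an optional sign followed by digits.
LEXICAL_SPACE = {XSD_INTEGER: re.compile(r"[+-]?[0-9]+")}


class IllFormedLiteral(ValueError):
    """Raised when a lexical form is outside the datatype's lexical space."""


def is_integer(lexical):
    # Type-checking built-in: true only for xsd:integer lexical forms.
    return LEXICAL_SPACE[XSD_INTEGER].fullmatch(lexical) is not None


def create_typed_literal(lexical, datatype):
    # Hypothetical constructor: returns a (lexical, datatype) pair, or
    # raises instead of producing an ill-formed literal.
    # This sketch only knows xsd:integer; other datatypes raise KeyError.
    if LEXICAL_SPACE[datatype].fullmatch(lexical) is None:
        raise IllFormedLiteral(f"{lexical!r} is not a valid {datatype}")
    return (lexical, datatype)


ages = {":a": "old", ":b": "33", ":c": "young", ":d": "88"}

# Guarded rule: only convert values that pass the type check.
typed = {x: create_typed_literal(y, XSD_INTEGER)
         for x, y in ages.items() if is_integer(y)}
```

With the guard in place only :b and :d get typed values. Without it, the constructor raises on "old" rather than emitting an ill-formed literal.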

The reason RDF allows ill-formed typed literals is in pursuit of the 
open world goals. An RDF processor can receive data using types it 
doesn't know about and pass them on safely - it may not do any useful 
computation with them but it can round trip, return query answers and 
copy values.

If RDF required all processors to raise an error on receiving ill-typed 
data, then a processor that saw a literal whose type it didn't understand 
would have to complain even if the value was in fact legal.
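
To make the difference concrete, here is a small Python sketch contrasting the two policies. Everything in it is illustrative: the Literal type, the function names, and the http://example.org/types#myDate datatype IRI are made up for the example, and the processor is assumed to understand only xsd:integer.

```python
import re
from typing import NamedTuple

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Datatypes this processor understands, with their lexical spaces.
KNOWN = {XSD_INTEGER: re.compile(r"[+-]?[0-9]+")}


class Literal(NamedTuple):
    lexical: str
    datatype: str


def status(lit):
    # "unknown" means the processor cannot tell whether the value is legal.
    pattern = KNOWN.get(lit.datatype)
    if pattern is None:
        return "unknown"
    return "well-formed" if pattern.fullmatch(lit.lexical) else "ill-formed"


def strict_accept(lits):
    # Rejects anything it cannot verify, including legal values
    # of datatypes it does not know.
    for lit in lits:
        if status(lit) != "well-formed":
            raise ValueError(f"cannot verify {lit}")
    return list(lits)


def open_accept(lits):
    # RDF-style: pass everything on unchanged, without computing on it.
    return list(lits)


data = [Literal("33", XSD_INTEGER),
        Literal("2008-02-22", "http://example.org/types#myDate")]
```

Here open_accept(data) round-trips both literals, while strict_accept(data) raises on the second one even though it may be perfectly legal for its own datatype.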

It would be nice to preserve this feature in RIF, though arguably RIF 
processors are mostly going to be computing with the data and so need to 
know about the corresponding builtins anyway.

Dave
-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England

Received on Friday, 22 February 2008 10:54:28 UTC