Re: review of rdf:text, dated 2008-11-04

On 6 Nov 2008, at 13:43, Jos de Bruijn wrote:
[snip]
>> I think you are not quite grasping the issue here. (I do prefer finite
>> alphabets myself, fwiw.) The point is how to design the type so that it
>> is extensible to additional characters that will definitely be added (by
>> Unicode). Note that problems along these lines have already occurred in
>> XML land. I don't think we can *merely* punt on this.
>
> What problems are there in XML land?

http://norman.walsh.name/2004/09/30/xml11

> In any case, like XML, this definition relies on ISO/IEC 10646, which
> has provisions for extensions. It seems to me that this is the
> appropriate way to deal with extensibility; it's not necessary to define
> our own mechanism.

We're not defining our own mechanism per se; we're trying to
accommodate the fact of extensions.

> In general, I think that this datatype should be based on the XML Schema
> string datatype, and if there are problems with extensibility, they
> should be solved in XML Schema.

It *is* based on XML Schema strings.

>> (And the problem is that future changes will change the meaning of some
>> ontologies. I presume that this will be true for some RIF rulesets if
>> you have the appropriate facets and builtins.)
>
> We don't have such problems in RIF, because we don't allow built-ins in
> rule heads.

I fail to see how that matters. You'll get different answers from
builtins, so different rules will fire depending merely on which
characters are admissible.
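
To make that concrete, here is a minimal sketch in Python. Everything in
it is hypothetical: the two alphabets, the is_text builtin, and the
one-rule "ruleset" are illustrations only, not anything defined by RIF,
OWL 2, or rdf:text. The builtin appears only in the rule body, yet the
set of fired rules still changes when the alphabet grows.

```python
# Hypothetical alphabets: each "version" is just a set of code points.
ALPHABET_V1 = frozenset(range(0x20, 0x7F))       # illustrative: printable ASCII
ALPHABET_V2 = ALPHABET_V1 | frozenset({0x20AC})  # a later version adds U+20AC

def is_text(value, alphabet):
    """Hypothetical builtin: true iff every character is in the alphabet."""
    return all(ord(ch) in alphabet for ch in value)

def fired_rules(facts, alphabet):
    """Evaluate 'valid_label(x) :- label(x), is_text(x)' over the facts.

    The builtin is used in the rule body only, never in the head.
    """
    return [x for x in facts if is_text(x, alphabet)]

facts = ["price", "price in \u20ac"]

print(fired_rules(facts, ALPHABET_V1))  # conclusions under the old alphabet
print(fired_rules(facts, ALPHABET_V2))  # conclusions under the new alphabet
```

The same ruleset over the same facts derives one conclusion under the
first alphabet and two under the second.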

> Further, if future changes in data types potentially pose a problem to a
> particular language, the specification of that language should deal with
> this problem. I suspect that OWL 2 does something like that for the
> string data type.

This is how it's done :)

You should talk with Boris more than me. I'd personally prefer a
finite alphabet and to bite the bullet on extensions (I think...). But
you don't seem to be acknowledging the issue.

Cheers,
Bijan.

Received on Thursday, 6 November 2008 17:45:29 UTC