Re: ISSUE-126 (Revisit Datatypes): The list of normative datatypes should be revisited

On Jun 19, 2008, at 3:49 AM, Boris Motik wrote:
>> I also think we need to clarify what being a supported datatype
>> means.
>
> In OWL 1, people often complained that different systems supported a  
> radically different subset of XML Schema datatypes. Therefore,
> we decided to include a more extensive list of datatypes in OWL 2
> and make them normative -- that is, each OWL 2 reasoner that
> wants to have a compliance label should support all of them (with  
> all the allowed facets).

I think this is an assumption that needs to be looked at more
carefully. For example, as you have offered in your point 3, given the
choice between not being able to use a datatype at all and being able
to use it with limitations, many would choose the limitations.

>> In particular it should be clarified whether this means that
>> tools will simply reject ontologies that mention these types, even as
>> annotation values,
>
> I believe we should not make a distinction between what is supported  
> in annotations and what is supported in class descriptions.
> This is likely to cause confusion and is going to lead to a very  
> complex specification. I believe we should carefully select the set
> of datatypes that we think we can support in class descriptions, and  
> we should make them normative. Clearly, nothing prevents tools
> from supporting other datatypes and allowing them in class  
> descriptions and/or annotations, but I'd leave this out of the spec.

I'm not sure I agree. The spec could simply say that all XML literals
are valid values for annotation properties. What is the complication  
that I'm missing?

>> If there were no n-ary datatypes and no facets, would any of these
>> datatypes be a problem?
>
> This issue has nothing to do with n-ary predicates; it addresses the  
> current (unary) case.

OK - this is the facets case. So if we disallowed facets on all the  
datatypes you have on your exclude list, then they would not cause  
problems? I note that there is still value for them - they can be used  
for domain/range, in cardinality restrictions...

>> Of your points below, 2 and 3 don't seem problematic from my point of
>> view (internationalized string is missing from the list in 3- an
>> oversight I presume). The others require more thought on my part.
>>
>
> I haven't seen any ontology using, say, xsd:gYearMonth; therefore, I  
> really believe that my first point should not be contentious.

http://swoogle.umbc.edu/index.php?option=com_frontpage&service=search&queryType=search_swd_ontology&searchString=gYearMonth&searchStart=1

> Furthermore, there is nothing that would prevent people from  
> implementing the remaining datatypes.

Yes, but as we know, this means that this doesn't offer much benefit  
to users - it's the OWL 1 situation.

> Regarding 4, I really don't believe that people would see a
> difference in the consequence in practice, but this would make the
> spec much cleaner. Here is an example of what might go wrong. Imagine
> you have the following ontology:
>
> (1) PropertyRange( R DatatypeRestriction( xsd:float minExclusive f1 maxExclusive f2 ) )
> (2) PropertyRange( S DatatypeRestriction( xsd:float minExclusive f1 maxExclusive f2 ) )
> (3) DisjointProperties( R S )
> (4) ClassAssertion( SomeValuesFrom( R rdfs:Literal ) a )
> (5) ClassAssertion( SomeValuesFrom( S rdfs:Literal ) a )
>
> Now this ontology is satisfiable iff the data range  
> DatatypeRestriction( xsd:float minExclusive f1 maxExclusive f2 )  
> contains two or
> more floating point values. To determine this, you thus need to be  
> able to determine whether there are at least two different values
> between f1 and f2.
>
> My first observation is that implementing this correctly is not  
> trivial. Note that you can't simply subtract the two numbers; you
> need to take into account that each floating point number is  
> actually represented as m*2^e and then you need to do some really
> nasty operations.
>
> My second observation is that, in practice, nobody will care: f1 and  
> f2 will be typically sufficiently apart so that there will be
> plenty of numbers between them for you to choose from.
>
> But then, we have a problem for the implementors: in practice, the  
> precise inference will probably never be relevant, but they still
> have to provide it because of a possible corner-case. Note that,  
> because xsd:float is discrete and finite, it is in principle
> possible to choose f1 and f2 such that there is exactly one floating  
> point number between them, which would then make the ontology
> unsatisfiable. Thus, to have a 100% correct implementation, people  
> will have to provide this nasty code.
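
To make the corner case above concrete, here is a minimal Python sketch
that counts the xsd:float values strictly between f1 and f2 by comparing
IEEE 754 single-precision bit patterns instead of subtracting. It is an
illustration only, not something taken from the proposal: the function
names are hypothetical, and it assumes f1 and f2 are non-NaN, exactly
representable as 32-bit floats, and that +0 and -0 count as one value.

```python
import struct


def float32_ordinal(x: float) -> int:
    """Map a 32-bit float to an integer so that neighbouring floats
    map to neighbouring integers (assumption: +0 and -0 both map to 0)."""
    if x != x:
        raise ValueError("NaN has no position in the ordering")
    # Reinterpret the single-precision encoding as an unsigned integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Non-negative floats are already ordered by their bit pattern;
    # negative floats (sign bit set) are mirrored below zero.
    return bits if bits < 0x80000000 else 0x80000000 - bits


def count_floats_between(f1: float, f2: float) -> int:
    """Number of 32-bit floats v with f1 < v < f2."""
    return max(0, float32_ordinal(f2) - float32_ordinal(f1) - 1)


def range_has_two_values(f1: float, f2: float) -> bool:
    """The check the example ontology needs: are there at least two
    distinct values in the open interval (f1, f2)?"""
    return count_floats_between(f1, f2) >= 2


# f2 is two representable steps above f1, so exactly one float lies between.
print(count_floats_between(1.0, 1.0 + 2.0 ** -22))  # 1
print(range_has_two_values(1.0, 1.0 + 2.0 ** -22))  # False
```

With these inputs the data range in axioms (1) and (2) would contain a
single value, which is the unsatisfiable corner case described above.
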

This is very helpful. Thanks!

Even with your proposal, which addresses satisfiability, this is an  
issue for easy keys.

> Hence my suggestion: let us modify the type system such that a  
> precise implementation can be produced efficiently and such that it
> provides the intuitive answers. In this example, this means that  
> xsd:float would be treated as the set of all real numbers. This
> makes it much easier to answer the above question: if f2 > f1, there  
> are infinitely many real numbers between f1 and f2! Thus, in
> all practical cases, we are getting the result that the users  
> wanted, and we are not requiring the implementors to jump through
> hoops.
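
For contrast, a minimal sketch of the same check under the proposed
reading, where the constants are treated as real numbers. The function
name is hypothetical, and Python doubles stand in for the float constants
purely for illustration (math.nextafter needs Python 3.9 or later). Under
this reading the open interval (f1, f2) is infinite whenever f1 < f2, so
the check collapses to one exact comparison.

```python
import math
from fractions import Fraction


def range_has_two_values_real(f1: float, f2: float) -> bool:
    """Under the real-number reading, (f1, f2) contains infinitely many
    values as soon as f1 < f2, so 'at least two values' is just f1 < f2."""
    # Fraction(float) converts exactly, so no rounding enters the comparison.
    return Fraction(f1) < Fraction(f2)


# Two representable steps above 1.0: only one double lies strictly between,
# yet there are infinitely many reals in the same interval.
f1 = 1.0
f2 = math.nextafter(math.nextafter(1.0, 2.0), 2.0)
print(range_has_two_values_real(f1, f2))  # True
```
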

Will think on this. Thanks!

-Alan

>
>
> Regards,
>
> 	Boris
>
>
>> Thanks,
>> Alan
>>
>> On Jun 18, 2008, at 3:23 PM, Boris Motik wrote:
>>
>>>
>>> Hello,
>>>
>>> So here is a proposal for resolving this issue.
>>>
>>> 1. We exclude xsd:time, xsd:date, xsd:gYearMonth, xsd:gYear,
>>> xsd:gMonthDay, xsd:gDay, xsd:gMonth, and xsd:base64Binary from the
>>> list
>>> of supported datatypes. Note that this doesn't preclude people from
>>> implementing them (if they can figure out how to do this).
>>>
>>> 2. We define xsd:anyURI to be a subset of xsd:string.
>>>
>>> 3. We allow the "pattern" facet only on the following datatypes:
>>> xsd:string, xsd:anyURI, xsd:normalizedString, xsd:token,
>>> xsd:language, xsd:NMTOKEN, xsd:Name, and xsd:NCName.
>>>
>>> 4. We introduce a new owl:real datatype. This datatype would allow
>>> for the following types of constants:
>>>
>>> - rational numbers written according to
>>> http://www.w3.org/2007/OWL/wiki/OWL_Rational
>>> - floating point numbers written in the format as specified in the
>>> definition of xsd:float and xsd:double in the XML Schema
>>> - decimal numbers as written in the format as specified in the
>>> definition of xsd:decimal
>>> - integer numbers as written in the format as specified in the
>>> definition of xsd:integer and related datatypes
>>>
>>> Furthermore, we would make xsd:float and xsd:double (and possibly
>>> xsd:decimal as well) synonyms for owl:real. This would be the only
>>> deviation from the XML Schema datatype system: there, some very
>>> large numbers are not members of xsd:float. I believe, though, that
>>> this would not bother people in practice.
>>>
>>> Finally, we can include xsd:nonPositiveInteger,
>>> xsd:negativeInteger, xsd:long, xsd:int, xsd:short, xsd:byte,
>>> xsd:nonNegativeInteger,
>>> xsd:unsignedLong, xsd:unsignedInt, xsd:unsignedShort,
>>> xsd:unsignedByte, and xsd:positiveInteger with the existing
>>> semantics as
>>> usual.
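
One way to picture point 4 is a parser that maps every kind of constant
onto a single exact rational value. The sketch below is only an
illustration under stated assumptions: parse_real_constant is a
hypothetical name, the rational syntax is assumed to be
"numerator/denominator" (the OWL_Rational page is not quoted here), and
the special xsd:float forms INF, -INF and NaN are rejected because they
are not real numbers.

```python
from fractions import Fraction


def parse_real_constant(lexical: str) -> Fraction:
    """Parse an integer, decimal, float-style, or 'n/d' constant
    into an exact rational (assumed lexical forms, see above)."""
    text = lexical.strip()
    if text in ("INF", "-INF", "+INF", "NaN"):
        raise ValueError(f"{text} does not denote a real number")
    # Fraction accepts '42', '-0.125', '1.5E3' and '1/3' directly.
    return Fraction(text)


# Different lexical forms of the same number compare equal.
print(parse_real_constant("0.5") == parse_real_constant("1/2"))   # True
print(parse_real_constant("5e-1") == parse_real_constant("1/2"))  # True
```

Because all constants land in one value space, comparisons between, say,
a decimal and a float-style constant need no datatype-specific rules.
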
>>>
>>> Regards,
>>>
>>> 	Boris
>>>
>>>
>>
>
>

Received on Thursday, 19 June 2008 08:28:20 UTC