- From: Boris Motik <boris.motik@comlab.ox.ac.uk>
- Date: Wed, 9 Jul 2008 17:58:02 +0100
- To: "'Alan Ruttenberg'" <alanruttenberg@gmail.com>
- Cc: "'OWL Working Group WG'" <public-owl-wg@w3.org>
Hello,

> -----Original Message-----
> From: Alan Ruttenberg [mailto:alanruttenberg@gmail.com]
> Sent: 09 July 2008 15:18
> To: Boris Motik
> Cc: 'OWL Working Group WG'
> Subject: Re: A possible structure of the datatype system for OWL 2 (related to ISSUE-126)
>
>
> On Jul 8, 2008, at 5:16 PM, Boris Motik wrote:
> > Hello,
> >
> > 1. Datatype Map
> > ----------------
>
> I wonder if we should still use the term "datatype", as there will
> likely be confusion with the xsd sense of datatype.
>
> > A datatype map consists of the following things:
> >
> > - a set of datatypes
> > - each datatype provides a set of allowed facets
> > - a possibly infinite set of constants (likely to be renamed to
> > literals, but I'll stick to "constant" for the moment)
> > - each constant consists of a lexicalValue and a typeURI
> > - it is written as "lexicalValue"^^typeURI
> >
> > Each datatype DT is assigned a value space DT^D, which is just a
> > nonempty set.
>
> Is the implication that DT -> Value space DT^D, one to one?
>

Yes. This is exactly the same as in the case of classes.

> So we have type, DT, DT^D ?
>

I didn't really understand that.

> > Each constant c is assigned a value c^D, which is just an object
> > from the union of the value spaces of all datatypes.
> >
> >
> > Thus, a datatype can be thought as a class with a predefined
> > extension.
>
> I'm not sure explaining it this way is helpful - might confuse rather
> than illuminate.
>

I actually think this is the proper way of thinking about datatypes. Take, for example, owl:integer: you can think about it as one big, infinite nominal that contains all integers. Hence, owl:integer is a class in a sense that its interpretation contains things. The main difference between datatypes and classes is that, in the case of datatypes, the interpretation is uniquely defined by the datatype map.
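
A minimal Python sketch of the datatype map structure described above may help make the pieces concrete. Everything in it is an assumption made for illustration: the class and method names (Constant, Datatype, DatatypeMap, value_of, in_value_space) are hypothetical, value spaces are modelled as membership tests rather than actual sets, and exact rationals stand in for the reals. It is not taken from any OWL 2 draft or implementation.

```python
from dataclasses import dataclass
from fractions import Fraction
from numbers import Rational
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Constant:
    """A constant, written "lexicalValue"^^typeURI."""
    lexical_value: str
    type_uri: str

    def __str__(self) -> str:
        return f'"{self.lexical_value}"^^{self.type_uri}'


@dataclass(frozen=True)
class Datatype:
    """A datatype DT; `contains` stands in for its value space DT^D."""
    uri: str
    contains: Callable[[Any], bool]
    facets: frozenset = frozenset()


class DatatypeMap:
    """Hypothetical datatype map: datatypes and constant types are
    registered independently, so a typeURI need not name a datatype."""

    def __init__(self) -> None:
        self.datatypes: Dict[str, Datatype] = {}
        self.parsers: Dict[str, Callable[[str], Any]] = {}

    def add_datatype(self, dt: Datatype) -> None:
        self.datatypes[dt.uri] = dt

    def add_constant_type(self, type_uri: str, parser: Callable[[str], Any]) -> None:
        self.parsers[type_uri] = parser

    def value_of(self, c: Constant) -> Any:
        """c^D: the value assigned to constant c."""
        return self.parsers[c.type_uri](c.lexical_value)

    def in_value_space(self, c: Constant, datatype_uri: str) -> bool:
        return self.datatypes[datatype_uri].contains(self.value_of(c))


dmap = DatatypeMap()
# Assumption: rationals approximate "all real numbers" for this sketch.
dmap.add_datatype(Datatype("owl:number", lambda v: isinstance(v, Rational)))
dmap.add_datatype(Datatype(
    "xsd:integer",
    lambda v: isinstance(v, Rational) and Fraction(v).denominator == 1,
))
# Constant types: note that xsd:int and xsd:decimal are not datatypes here.
dmap.add_constant_type("xsd:int", lambda s: Fraction(int(s)))
dmap.add_constant_type("xsd:decimal", lambda s: Fraction(s))
```

With this setup, Constant("42", "xsd:int") falls in the value spaces of both owl:number and xsd:integer, while Constant("123.03", "xsd:decimal") falls only in owl:number, even though neither xsd:int nor xsd:decimal was registered as a datatype.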

> > Note that this definition does not assume any relationship between
> > the set of supported typeURIs (which determine the allowed
> > constants) and the set of datatypes (which determine the allowed
> > sets of values).
>
> I think we should consider calling "typeURI" "lexicalFormURI" to
> suggest the correct thinking, as people tend to equate "type" and
> "class". (as with rdf:type)
>

I agree.

> Can we not simplify the above to: There are "Value spaces" and
> "lexicalFormURI"s. I'm not seeing how having "Datatypes" as an
> additional concept helps.
>

There is a distinction, albeit a subtle one. "Datatype" is a syntactic category; hence, you can put datatypes into property range axioms. "Value space" is a semantic category. Hence, you can't work with value spaces at the level of a syntax; that is, you don't put *the set of all integers* into an ontology when you say "xsd:integer is a range of P"; rather, you put xsd:integer (a datatype), which acts as a moniker for its value space.

> >
> > 2. Allowed datatypes
> > ---------------------
> >
> > Conformant OWL 2 implementations would be required to support the
> > following base datatypes, each of whose value spaces would be
> > disjoint:
> >
> > - owl:number - the value space is the set of all real numbers
> > - xsd:string - the value space is the set of all Unicode strings in
> > normal form C
> > - owl:internationalizedString - the value space set is the set of
> > pairs of the form (string,langTag)
> > - xsd:hexBinary - the value space is the set of all finite
> > sequences of octets
>
> I'm wondering whether we should simply say: OWL has the following
> (following your later mail).
>
> owl:Number
> owl:CharacterString
> owl:BitString
> owl:Integer
>
> We confuse the issue by using the xsd uris to name a different sort
> of thing (an OWL value space, not an XSD:type)
>

I personally don't mind renaming all datatypes to owl:*.
I can see, however, people might object, partially because of a backwards compatibility issue. After all, in OWL 1, you had xsd:string and xsd:integer.

> > The following datatype would also be supported in OWL 2:
> >
> > - xsd:integer - the value space is the subset of the value space of
> > owl:number containing all integers
>
> See above.
>
> > Finally, we might support the following "shortcut" datatypes, whose
> > value spaces can be defined from the value spaces of the above
> > mentioned datatypes using facets
> >
> > - various xsd:integer derivatives, such as xsd:int and xsd:long
> > - various xsd:string derivatives, such as xsd:Name
>
> In order to keep the design clean, I'd suggest that we define these
> in the owl namespace. We can connect the xsd types to the owl version.
>
> However: The use of e.g. xsd:string in restrictions is the common
> idiom. I think we should document that some xsd datatypes, when used
> in a restriction, are understood to mean certain owl value spaces.
>
> > 3. Allowed constants
> > ---------------------
> >
> > Conformant OWL 2 implementations are required to support the
> > following constant types:
> >
> > - "nnn"^^xsd:int and all derivatives that fall within xsd:int - all
> > such constants are to be interpreted as elements of owl:number
> > - "aaEbb"^^xsd:float - all such constants save for NaN and +-inf
> > are to be interpreted as elements of owl:number
>
> Consider extending owl:number with these constants. We need some
> interpretation of them if they are to remain intact when part of an
> OWL file. These are effectively, "promotion" rules.
>

We can have owl:numberPlus (or owl:numberOnSteroids if you prefer :-) that contains these guys as well.

> > - "abc"^^xsd:string - interpreted as "abc"
>
> as you later suggest, ("abc", null) or ("abc", "") . The latter
> avoids the issue of what to do about the pattern facet on lang.
>
> > - "abc"@langTag - interpreted as a pair ("abc",langTag)
> >
> >
> > 4. Discussion
> > --------------
> >
> > The set of constants is chosen such that implementations don't need
> > to support numbers with arbitrary precision, which might be quite
> > cumbersome. In fact, implementations are only required to support
> > 32 bit integers and single precision floating point numbers.
>
> On today's hardware, I would set this to be 64 bit integers or even
> 128 bit integers, and double precision float. Some machines don't
> really have single float hardware, instead rounding from double float.
>

I don't mind going up to 64 bit. 128 might be a bit too much (at least in Java -- a language in which many reasoners are implemented you don't have this).

> > There are efficient ways to represent these on virtually all systems.
> >
> > The set of datatypes, however, allows one to refer to the sets of
> > all integers and real numbers. This allows one to specify the
> > ontology in a way that makes reasoning easy.
> >
> > Implementations are free to support other constants as well. Note
> > that these extensions do not necessarily mean that we need new
> > datatypes (i.e., new value spaces). For example, an implementation
> > might choose to support arbitrary precision numbers via constants
> > of the form "123.03"^^xsd:decimal. Note that the proposed list of
> > datatypes already contains the appropriate value space for such
> > constants (i.e., owl:number).
>
> I think xsd:decimal should be considered a lexical form of owl:Number.
>
> > The open issues are what to do with NaN and +-inf and with
> > date-time datatypes.
>

I think that, if we agree to the basic structure, we can easily accommodate the remaining "extra" constants and datatypes.

Regards,

Boris

> In the first case, I suggest above that owl:Number be
> real+"NaN"+"-INF"+"+INF"
> I'd also suggest that "-0" and "+0" be considered lexical forms of
> the number 0.
>
> For the date-time datatypes, I wonder whether it would work to define:
>
> owl:Time (isomorphic to the reals)
> owl:TimeZoneTime (also isomorphic to the reals)
>
> There is one value space for all the lexical date-times have time
> zone specified, and another value space for all the lexical
> date-times. There would be no comparison possible between owl:Time and
> owl:TimeZoneTime.
>
> There would still be work necessary to determine whether the
> repeating interval types, like monday, are feasible to implement.
>
> -Alan
>
> >
> > Regards,
> >
> > Boris
> >
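
The interpretation of the required constant types discussed in section 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name interpret is hypothetical, exact rationals (Fraction) stand in for elements of owl:number, the 32-bit and single-precision limits follow the original proposal rather than the 64-bit suggestion, and Python's int() and float() parsers are more permissive than the XSD lexical grammars. NaN and the infinities are rejected here because their treatment is still listed as an open issue.

```python
import struct
from fractions import Fraction

INT_MIN, INT_MAX = -2**31, 2**31 - 1          # xsd:int is a 32-bit signed integer
SPECIAL_FLOATS = {"NaN", "INF", "+INF", "-INF"}


def interpret(lexical_value: str, type_uri: str) -> Fraction:
    """Map a constant "lexical_value"^^type_uri to a value in owl:number.

    Hypothetical helper; Fraction stands in for the real numbers.
    """
    if type_uri == "xsd:int":
        n = int(lexical_value)                 # "-0" and "+0" both parse to 0
        if not INT_MIN <= n <= INT_MAX:
            raise ValueError(f"{lexical_value} is outside the xsd:int range")
        return Fraction(n)
    if type_uri == "xsd:float":
        if lexical_value in SPECIAL_FLOATS:
            raise ValueError(f"{lexical_value} has no value in owl:number")
        # Round to single precision by packing into a 4-byte float and back.
        single = struct.unpack("f", struct.pack("f", float(lexical_value)))[0]
        return Fraction(single)                # exact value; -0.0 becomes 0
    raise ValueError(f"unsupported constant type {type_uri}")
```

Under these assumptions "-0"^^xsd:int, "+0"^^xsd:int and "-0"^^xsd:float all denote the same number 0, which matches the suggestion that "-0" and "+0" be treated as lexical forms of 0.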
Received on Wednesday, 9 July 2008 16:59:38 UTC