- From: <ewallace@cme.nist.gov>
- Date: Thu, 13 Jan 2005 12:57:25 -0500 (EST)
- To: public-swbp-wg@w3.org
My review of "XML Datatypes in RDF and OWL" [1]

Overall, this is a good document. It discusses a number of issues related to the use of datatypes in RDF and OWL that were left unresolved by the Recommendations. It is comprehensive in addressing the issues discussed: covering alternative approaches and providing appropriate references and/or quotes as necessary. In fact, because of this comprehensiveness and the importance of the references, reviewing this document was more of a project than I had originally envisioned (although reading these references proved enlightening).

I have no major issues with this document, although I do have some lesser concerns and comments. These fall into two categories: general and detailed. The detailed concerns were already presented in an email sent to the list yesterday [details]. The general concerns follow below.

* The document covers a number of loosely related subjects. It is like a bag of datatype issues and other related material. Different parts will be of interest to different audiences. I mentioned this before, but my main concern now is that someone reading linearly through the document will encounter the interpretation descriptions in 1.2, 1.3, and 1.4 and stop reading. I think such material would be better placed in an appendix. The purpose and role of such material in this document were also not clear to me. By role, I mean: are the interpretation descriptions in 1.2 and 1.3 quotes from the RDF and OWL semantics documents respectively, or a different form of the same content?

* An important reference for datatypes in computing environments is the ISO standard on Language-independent datatypes - ISO/IEC 11404:1996. It provides an excellent framework for describing datatypes and appears to have been a strong influence on the XML Schema base types document [2] (which includes a reference to 11404). The XSCH note could benefit from referencing the ISO work directly and using some of its terminology, although I don't think that this is necessary for this iteration of the note.

* My primary interest in these datatype issues is that the treatment of numeric types be consistent with their use in engineering applications (or at least usable by those applications). Loss of precision or unexpected changes in values due to automatic type conversion could be problematic in an engineering environment.

Engineering view of some numeric types:

To explain the engineering point of view on this, let me mention three important numeric types for that domain: count, measurement, and constant.

A count is an integer representing essentially the cardinal number for a set of things classified by some set of tests. An example would be the count of packages of candy available for shipment. A count is an exact number. Tests may include measurements, but a count is not an approximation of a sum of these measurements, nor is it a sum of the approximations of these measurements.

A measurement is an inexact numeric value (usually represented as a real) produced by some measurement method. This value denotes a value range which includes the actual value. The actual value is unknowable, but more precise measurement methods can reduce the range of uncertainty up to a point. The precision or uncertainty is usually included with the measurement value, either implicitly using significant figures or explicitly using a separate property value such as an error range.

A constant is an exact value used in computation. It may or may not be possible to express it exactly as a numeric. An inch is exactly 2.54 centimeters, but Pi is not 3.14159.

This suggests some potential needs and concerns for a type system underlying this.

1. Because the value spaces for these types are different, measurements are disjoint from counts and constants.
2. Some means of capturing precision or error/uncertainty is needed for measurement values.
3. Some means is needed for denoting constants that cannot be expressed precisely in numeric form.

Some answers about how 1 and 2 can/must be handled with XML Schema types are revealed in the XML Schema Datatypes document. In [2] the description for Decimal explicitly states that, "Precision is not reflected in this value space, the number 2.0 is not distinct from the number 2.00." Thus precision cannot be encoded in decimal values or other types derived from or constructed with Decimal. This means that objects must be used to state precision or error properties for measurements (this is not a bad approach anyway, since there are often other properties or metadata associated with a measurement, as mentioned previously by Bernard [3]). Measurements on the SW are thus not datatypes and the disjoint type issue becomes moot.

For issue 3, there remains no answer. As far as I know there is no way to denote a rational value without using a numeric literal, but many important values cannot be expressed precisely as numeric literals.

Information on these issues may belong in this datatype note or not, I am not sure. I do think that the SWBPD wg should present these issues in one of its notes, though.

-Evan

[1] http://www.w3.org/2001/sw/BestPractices/XSCH/xsch-sw/
[2] http://www.w3.org/TR/xmlschema-2/
[3] http://lists.w3.org/Archives/Public/public-swbp-wg/2004Dec/0119.html
[details] http://lists.w3.org/Archives/Public/public-swbp-wg/2005Jan/0040
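
Illustration of the Decimal point raised above: a minimal sketch, assuming Python's standard decimal and fractions modules as a stand-in for the XML Schema value space (they are not an XML Schema implementation). The Measurement class and the name CM_PER_INCH are hypothetical, introduced only for this example. It shows that 2.0 and 2.00 compare equal as values, that an uncertainty therefore has to travel as a separate property on an object, and that an exact constant such as the inch-to-centimeter factor can be held as a rational.

```python
from dataclasses import dataclass
from decimal import Decimal
from fractions import Fraction


@dataclass(frozen=True)
class Measurement:
    """Hypothetical measurement object: a value plus an explicit uncertainty."""
    value: Decimal
    uncertainty: Decimal  # half-width of the range assumed to contain the actual value

    def interval(self):
        """Return the (low, high) range denoted by this measurement."""
        return (self.value - self.uncertainty, self.value + self.uncertainty)


# In the value space, 2.0 and 2.00 are the same number, so the
# literal alone cannot carry the precision of a measurement.
same_value = Decimal("2.0") == Decimal("2.00")

# Precision stated as a separate property distinguishes the two readings.
coarse = Measurement(Decimal("2.0"), Decimal("0.05"))
fine = Measurement(Decimal("2.00"), Decimal("0.005"))

# An exact constant: one inch is exactly 2.54 cm, held here as a rational.
CM_PER_INCH = Fraction(254, 100)
```

Here coarse and fine hold equal values but are different objects because their uncertainties differ, which is the object-based approach described for measurements. Nothing in this sketch addresses issue 3: a value such as Pi still has no exact representation in either module.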
Received on Thursday, 13 January 2005 17:57:30 UTC