- From: Steve Rosenberry <steve.rosenberry@verizon.net>
- Date: Thu, 29 Mar 2001 14:08:33 -0500
- To: www-xml-schema-comments@w3.org
I recently raised, on the xml.org xml-dev list, the issue of deriving an attribute datatype that combines a float value with a string unit-of-measurement designation. After some discussion, the best solution currently available under the Proposed Recommendation appears to be declaring a string attribute and then applying an appropriate pattern restriction to limit the attribute to the allowable digits, decimal points, and units designations in the proper order.

The point was then made that doing this eliminates the possibility of checking the attribute values against any inclusivity and exclusivity facets in a type-specific manner, i.e. one cannot specify an upper or lower limit on the float portion of the attribute.

Without getting into a discussion of all the possible potholes one can fall into, especially with regard to unit conversions before applying inclusivity/exclusivity facets, my question to the W3C Working Group is: "Is there merit to allowing a datatype to be specified as an atom in a regular expression?"

I suggest something similar to the Unicode Database category escapes, e.g. \p{Lu} specifies all uppercase letters. For discussion purposes, let's assume \x{datatype_name} is the adopted syntax.
A very simple example follows. It declares a datatype to represent a percentage value with the '%' required in the actual XML, uses that datatype in defining an element's attribute, and then shows an XML element that would validate against the schema:

    <!-- declare the datatype using the proposed \x{} syntax -->
    <xsd:simpleType name="Percentage">
      <xsd:restriction base="xsd:string">
        <xsd:pattern value="\x{xsd:float}%" />
      </xsd:restriction>
    </xsd:simpleType>

    <!-- declare an element schema using the Percentage datatype -->
    <xsd:element name="AVCommand">
      <xsd:complexType>
        <xsd:attribute name="volume">
          <xsd:simpleType>
            <xsd:restriction base="Percentage">
              <xsd:minInclusive value="12%" />
              <xsd:maxInclusive value="45%" />
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:attribute>
      </xsd:complexType>
    </xsd:element>

    <!-- An actual element as defined in an XML file -->
    <AVCommand volume="25%"/>

The clarity, definiteness, and simple readability of the above syntax for someone working with actual XML files allow XML programmers to more easily self-document their use of the applicable schemas in a specific application. In addition, the attribute can be fully validated both for a float value within the inclusive range and for the presence of the required unit of measurement.

Obviously, this example could easily be expanded to include a number of different units of measure, or even go beyond the units-of-measure paradigm. Restricted string attributes could be built up using different enumerated string simpleTypes in different positions to ensure a particular attribute order in a single string -- something I believe is also a point of discussion regarding the XML Schema specification.

I would also expect that any parser capable of handling regular expressions as they are currently defined could be extended to handle this new syntax with a minimum of effort.
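To make the parser-side claim concrete, here is a minimal Python sketch of how a validator might handle the proposed \x{datatype_name} atom: each atom is expanded into a capturing group over the datatype's lexical form, and the captured text is then parsed so that range facets can be compared as numbers. Everything here is hypothetical illustration, not part of any XML Schema processor: the `DATATYPES` registry, the function names, and the simplified float pattern (which ignores INF and NaN) are all assumptions.

```python
import re

# Simplified lexical pattern for a float. This is an assumption for the
# sketch, not the full xsd:float lexical space (INF and NaN are ignored).
FLOAT_LEXICAL = r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?"

# Hypothetical registry: datatype name -> (lexical regex, parser function).
DATATYPES = {"xsd:float": (FLOAT_LEXICAL, float)}

# Matches the proposed atom, e.g. \x{xsd:float}
ATOM = re.compile(r"\\x\{([^}]+)\}")


def compile_typed_pattern(pattern):
    """Expand each \\x{name} atom into a capturing group.

    Returns the compiled regex and the list of parsers, one per atom.
    Limitation of this sketch: the rest of the pattern must not contain
    capturing groups of its own.
    """
    parsers = []

    def expand(match):
        lexical, parser = DATATYPES[match.group(1)]
        parsers.append(parser)
        return "(" + lexical + ")"

    return re.compile(ATOM.sub(expand, pattern)), parsers


def parse_typed(pattern, text):
    """Return the typed values captured by each atom, or None if no match."""
    regex, parsers = compile_typed_pattern(pattern)
    m = regex.fullmatch(text)
    if m is None:
        return None
    return [parse(group) for parse, group in zip(parsers, m.groups())]


def validate(pattern, text, min_inclusive, max_inclusive):
    """Check the pattern, then apply the range facets to the first atom."""
    value = parse_typed(pattern, text)
    low = parse_typed(pattern, min_inclusive)
    high = parse_typed(pattern, max_inclusive)
    if value is None or low is None or high is None:
        return False
    return low[0] <= value[0] <= high[0]
```

With the Percentage example above, `validate(r"\x{xsd:float}%", "25%", "12%", "45%")` accepts the value, while "50%" fails the range check and "25" fails the pattern because the '%' is missing. The sketch deliberately sidesteps unit conversion, which is one of the potholes mentioned earlier.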
Surely the two above examples of a use for this syntax are not the only cases that can benefit from a regular expression syntax that would allow separate atoms to be validated against specific datatypes.

Is there merit to allowing a datatype to be specified as an atom in a regular expression?

--
Steve Rosenberry
Sr. Partner

Electronic Solutions Company -- For the Home of Integration
http://ElectronicSolutionsCo.com

http://BetterGoBids.com -- The Premier GoTo Bid Management Tool

(610) 670-1710
Received on Thursday, 29 March 2001 14:09:04 UTC