- From: Costello, Roger L. <costello@mitre.org>
- Date: Thu, 21 Jun 2012 16:44:42 +0000
- To: "xmlschema-dev@w3.org" <xmlschema-dev@w3.org>
Hi Folks,

Below is a discussion of the rule of least power and how it applies to XML Schema design. The rule of least power is very cool. Comments welcome.

/Roger

The rule of least power says that given a choice of suitable ways to implement something, choose the least powerful way.

The following example illustrates the rule of least power.

The XML Schema enumeration facets are less powerful than regular expressions (used by the XML Schema pattern facet), which are in turn less powerful than XPath (used by the XML Schema 1.1 assert element):

    enumerations < regular expressions < XPath expressions

Given the task of declaring an element to have a value that is one of a finite list of strings (or values of some other simple data type), you should declare the element using the least powerful method: enumerations.

Example: Create an XML Schema that declares a Color element to have one of these strings: red, white, or blue.

One way to implement Color is with a simpleType that lists the values using enumeration facets:

    <xs:element name="Color">
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <xs:enumeration value="red" />
                <xs:enumeration value="white" />
                <xs:enumeration value="blue" />
            </xs:restriction>
        </xs:simpleType>
    </xs:element>

A second way to implement Color is with a simpleType that lists the values using a regular expression in a pattern facet:

    <xs:element name="Color">
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <xs:pattern value="red|white|blue" />
            </xs:restriction>
        </xs:simpleType>
    </xs:element>

You should use the first way, not the second. Why?

Answer: With the first way it is easier for applications to analyze the XML Schema Color element for its list of valid values. With the second way, applications must understand the regular expression (regex) language. Although this particular regex is simple, the regex language is complex, and creating applications that determine what set of strings an arbitrary regex accepts is non-trivial.
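
As a rough illustration of that point, here is a minimal Python sketch of an application reading the two Color declarations. It is only a sketch under stated assumptions: the helper name facet_values is hypothetical, the two schema strings simply wrap the declarations above in an xs:schema element, and the code handles only an inline anonymous simpleType (it does not resolve named types or imports).

```python
import xml.etree.ElementTree as ET

# Namespace prefix used by ElementTree for XML Schema elements.
XS = "{http://www.w3.org/2001/XMLSchema}"

ENUM_SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Color">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="red" />
        <xs:enumeration value="white" />
        <xs:enumeration value="blue" />
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>"""

PATTERN_SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Color">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="red|white|blue" />
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>"""


def facet_values(schema_text, element_name, facet):
    """Hypothetical helper: collect the value attributes of one kind of
    facet (e.g. "enumeration" or "pattern") declared inline under the
    named element. Assumes an anonymous simpleType, as in the examples."""
    root = ET.fromstring(schema_text)
    for element in root.iter(XS + "element"):
        if element.get("name") == element_name:
            return [f.get("value") for f in element.iter(XS + facet)]
    return []


# Enumeration facets: the list of valid values is read straight off the schema.
print(facet_values(ENUM_SCHEMA, "Color", "enumeration"))

# Pattern facet: no enumerations to read, only a regex string that the
# application would still have to interpret to learn the valid values.
print(facet_values(PATTERN_SCHEMA, "Color", "enumeration"))
print(facet_values(PATTERN_SCHEMA, "Color", "pattern"))
```

With the enumeration version the application is done once it has read the attributes. With the pattern version it is left holding the string "red|white|blue", and recovering the value list in general means implementing (or embedding) a regex analyzer.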

The regular expression language has more power than enumeration facets. If your task is to constrain the value of an element to a specified list of values, then use enumeration facets, not the pattern facet.

There is a third way to implement this example, using the XML Schema 1.1 assert element:

    <xs:element name="Color">
        <xs:complexType>
            <xs:simpleContent>
                <xs:extension base="xs:string">
                    <xs:assert test=". = ('red', 'white', 'blue')" />
                </xs:extension>
            </xs:simpleContent>
        </xs:complexType>
    </xs:element>

In the assert element the value of the test attribute is an XPath expression:

    <xs:assert test="XPath" />

XPath is a powerful language, even more powerful than regular expressions. To analyze the XML Schema Color element for its list of valid values, applications must understand the XPath language, a daunting task indeed.

Lesson Learned: When creating an XML Schema, determine the suitable ways of implementing each feature and choose the one with the least power.

W3C Paper: Tim Berners-Lee and Noah Mendelsohn wrote a wonderful paper on the rule of least power:

    http://www.w3.org/2001/tag/doc/leastPower.html

Here are a few fascinating snippets from the paper:

Powerful languages inhibit information reuse.

Expressing constraints, relationships and processing instructions in less powerful languages increases the flexibility with which information can be reused: the less powerful the language, the more you can do with the data stored in that language.

Less powerful languages are usually easier to secure … Because programs in simpler languages are easier to analyze, it's also easier to identify the security problems that they do have.

… characteristics that make languages powerful can complicate or prevent analysis of programs or information conveyed in those languages

… Indeed, on the Web, the least powerful language that's suitable should usually be chosen.
This is The Rule of Least Power …

… the suggestion to use less powerful languages must in practice be weighed against other factors. Perhaps the more powerful language is a standard and the less powerful language not, or perhaps the use of simple idioms in a powerful language makes it practical to use the powerful languages without unduly obscuring the information conveyed. Overall, the Web benefits when less powerful languages can be successfully applied.

Received on Thursday, 21 June 2012 16:45:11 UTC