- From: Michael Kay <mike@saxonica.com>
- Date: Mon, 03 Jan 2011 20:09:43 +0000
- To: "Costello, Roger L." <costello@mitre.org>
- CC: "xmlschema-dev@w3.org" <xmlschema-dev@w3.org>
On 03/01/2011 19:44, Costello, Roger L. wrote:

I can't add to your list of advantages/disadvantages, but I would seriously question why you want to impose a limit of 100 characters on a string. Some people seem to do this as an ingrained habit - they haven't got rid of the punched-card mentality where strings were always fixed length.

There may be good reasons for doing it - for example, the data is going to be processed by an ancient COBOL application with limits that you can't afford to change; or you want to protect against certain kinds of DOS attack - but most of the time I see this kind of thing, the constraints are spurious. For example, people will put a limit of 10 characters on a phone number because they've never travelled widely enough to realize that's not a hard limit at all.

Michael Kay
Saxonica

> Hi Folks,
>
> I am interested in hearing your thoughts on the advantages and disadvantages of the following two approaches to restricting the length of a string value.
>
> Approach #1: In this simpleType the regex does not restrict the length; instead, the minLength and maxLength facets are used to restrict the length:
>
> <simpleType name="English-language-family-name">
>     <restriction base="string">
>         <minLength value="1" />
>         <maxLength value="100" />
>         <pattern value="[a-zA-Z' \.-]+" />
>     </restriction>
> </simpleType>
>
> Approach #2: Here is the same simpleType except the length restriction is implemented in the regex:
>
> <simpleType name="English-language-family-name">
>     <restriction base="string">
>         <pattern value="[a-zA-Z' \.-]{1,100}" />
>     </restriction>
> </simpleType>
>
> The disadvantage of the first approach is that maxLength and minLength are non-transferrable length restriction mechanisms. They are not something that could be used directly by Schematron or HTML5.
>
> The disadvantage of the second approach is that an application would require sophistication to parse the regex to understand its length constraints.
>
> The advantage of the second approach is that the constraints are completely contained within the regex. Thus, the regex could, with little or no modification, be lifted and dropped into an XSLT regex expression or a Schematron regex expression or an HTML5 regex expression.
>
> The advantage of the first approach is that it is easier for a machine to determine the simpleType's length restrictions.
>
> What other advantages and disadvantages do each approach have? Which approach do you recommend? Why?
>
> /Roger
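For readers comparing the two approaches in the quoted message, the following is a minimal sketch of how each might be checked outside a schema processor. It uses Python's re module as a stand-in for the XSD regex dialect, which is an assumption: the two dialects differ in places, although this particular character class behaves the same in both. The function names valid_by_facets and valid_by_regex are hypothetical and do not come from the thread.

```python
import re

# Character class copied from the schema in the quoted message.
NAME_CHARS = r"[a-zA-Z' \.-]"

# Approach #1: unbounded pattern plus separate length bounds
# (stand-ins for the minLength and maxLength facets).
PATTERN_UNBOUNDED = re.compile(NAME_CHARS + r"+")
MIN_LENGTH, MAX_LENGTH = 1, 100

# Approach #2: the length bounds live inside the regex quantifier.
PATTERN_BOUNDED = re.compile(NAME_CHARS + r"{1,100}")


def valid_by_facets(value: str) -> bool:
    """Approach #1: check the length bounds, then the pattern."""
    # XSD patterns must match the whole value, hence fullmatch.
    return (MIN_LENGTH <= len(value) <= MAX_LENGTH
            and PATTERN_UNBOUNDED.fullmatch(value) is not None)


def valid_by_regex(value: str) -> bool:
    """Approach #2: a single pattern carries both constraints."""
    return PATTERN_BOUNDED.fullmatch(value) is not None


# Both approaches should agree on every sample value.
samples = ["O'Neil", "Smith-Jones", "", "a" * 100, "a" * 101, "Kay2"]
for s in samples:
    assert valid_by_facets(s) == valid_by_regex(s)
```

In this sketch the two checks accept and reject the same values, so the choice between them rests on the portability and machine-readability trade-offs raised in the quoted message rather than on any difference in what is validated.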
Received on Monday, 3 January 2011 20:10:10 UTC