
Express length constraints in a regex or use maxLength and minLength?

From: Costello, Roger L. <costello@mitre.org>
Date: Mon, 3 Jan 2011 14:44:41 -0500
To: "xmlschema-dev@w3.org" <xmlschema-dev@w3.org>
Message-ID: <9E51F88D5247B648908850C35A3BBB50052DCD43AA@IMCMBX3.MITRE.ORG>
Hi Folks,

I am interested in hearing your thoughts on the advantages and disadvantages of the following two approaches to restricting the length of a string value.

Approach #1: In this simpleType the regex does not restrict the length; instead, the minLength and maxLength facets are used to restrict the length: 

    <simpleType name="English-language-family-name">
        <restriction base="string">
            <minLength value="1" />
            <maxLength value="100" />
            <pattern value="[a-zA-Z' \.-]+" />
        </restriction>
    </simpleType>


Approach #2: Here is the same simpleType except the length restriction is implemented in the regex:

    <simpleType name="English-language-family-name">
        <restriction base="string">
            <pattern value="[a-zA-Z' \.-]{1,100}" />
        </restriction>
    </simpleType>


The disadvantage of the first approach is that minLength and maxLength are non-transferable length-restriction mechanisms: they cannot be used directly by Schematron or HTML5.

The disadvantage of the second approach is that an application would need to parse the regex, which takes some sophistication, just to discover the length constraints.


The advantage of the second approach is that the constraints are completely contained within the regex. Thus, the regex could, with little or no modification, be lifted and dropped into an XSLT, Schematron, or HTML5 regular expression.
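
As a rough sketch of what that lifting might look like in Schematron: the element name family-name and the choice of the xslt2 query binding are assumptions made for illustration, not part of the schema above. The one modification shown is the explicit ^ and $ anchors. An XSD pattern always matches the whole value, whereas the XPath 2.0 matches() function looks for a match anywhere in the string unless anchored.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:pattern>
    <!-- "family-name" is a hypothetical element name used only for this sketch -->
    <sch:rule context="family-name">
      <!-- Same character class and {1,100} quantifier as Approach #2,
           with ^ and $ added because matches() is not implicitly anchored -->
      <sch:assert test="matches(., &quot;^[a-zA-Z' \.-]{1,100}$&quot;)">
        A family name must be 1 to 100 characters drawn from letters,
        apostrophe, space, period, and hyphen.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```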

The advantage of the first approach is that it is easier for a machine to determine the simpleType's length restrictions.
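
To illustrate how little machinery that takes, here is a minimal XSLT 1.0 sketch that lists the length facets of every simpleType in a schema document using plain path expressions, with no regex parsing involved. It assumes the schema elements are in the XML Schema namespace (for example, through a default namespace declaration on the enclosing schema element, which the snippets above do not show); the xs prefix and the output format are arbitrary choices.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Visit each simpleType that declares at least one length facet -->
    <xsl:for-each select="//xs:simpleType[xs:restriction/xs:minLength
                                          or xs:restriction/xs:maxLength]">
      <xsl:value-of select="@name"/>
      <xsl:text>: minLength=</xsl:text>
      <!-- An absent facet simply yields an empty string here -->
      <xsl:value-of select="xs:restriction/xs:minLength/@value"/>
      <xsl:text>, maxLength=</xsl:text>
      <xsl:value-of select="xs:restriction/xs:maxLength/@value"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the Approach #1 schema, this would report both bounds; run against Approach #2, it would report nothing, because the bounds exist only inside the pattern value.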


What other advantages and disadvantages does each approach have? Which approach do you recommend? Why?

/Roger
Received on Monday, 3 January 2011 19:45:14 GMT
