RE: XHTML Modularization 1.1: Lazy datatype patterns in XML Schema

Dear HTML editors,
Here is a small update to my previous e-mail
[http://lists.w3.org/Archives/Public/www-html-editor/2006JulSep/0015.html], which suggested better XML Schema patterns for some
XHTML Modularization 1.1 datatypes in
[http://www.w3.org/TR/2006/WD-xhtml-modularization-20060705/SCHEMA/xhtml-datatypes-1.xsd].

An updated "xhtml-datatypes-1.xsd" with my proposals is also attached to this e-mail.


ContentType: "A media type, as per [RFC2045]"
------------

- Short version:

 <xs:pattern value="[^/ ;,=]+/[^/ ;,=]+(;\s*[^/ ;,=]+=([^/ ;,=]+|&quot;([^&quot;\\]|\\\\|\\&quot;)*&quot;))*"/>


The update concerns the "quoted string" part (as per RFC 2822, without the optional comments):
 &quot;([^&quot;\\]|\\\\|\\&quot;)*&quot;

Details:

 "        # quotation mark
 (        #
  [^"\\]  ## any character except a quotation mark " or a backslash \
  |       ## or
  \\\\    ## an escaped backslash \\
  |       ## or
  \\"     ## an escaped quotation mark \"
 )*       # the content of a quoted string can
          # be 0 or more characters
 "        # quotation mark

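As a quick sanity check, the quoted-string part and the short pattern can be exercised outside a schema validator. The following is only a minimal sketch in Python (my choice for illustration, not part of the proposal); the helper names are hypothetical. It assumes &quot; is written as a plain ", and it uses re.fullmatch() to imitate the implicit anchoring of XML Schema patterns. Note that Python's \s accepts a few more whitespace characters than the XML Schema \s.

```python
import re

# Quoted-string part of the pattern, with &quot; written as a plain ".
QUOTED_STRING = r'"([^"\\]|\\\\|\\")*"'

# Building block of the short pattern: anything but / space ; , =
PART = r'[^/ ;,=]+'

# Short ContentType pattern, assembled from the pieces above.
SHORT_CONTENT_TYPE = (
    PART + '/' + PART
    + r'(;\s*' + PART + '=(' + PART + '|' + QUOTED_STRING + '))*'
)


def is_quoted_string(text: str) -> bool:
    """Hypothetical helper: True if the whole text is a quoted string."""
    return re.fullmatch(QUOTED_STRING, text) is not None


def is_short_content_type(text: str) -> bool:
    """Hypothetical helper: True if the whole text fits the short pattern."""
    return re.fullmatch(SHORT_CONTENT_TYPE, text) is not None


if __name__ == "__main__":
    samples = [
        'text/html',
        'text/html; charset=utf-8',
        'text/plain; title="a; b"',
        'text/html; charset',      # parameter without a value
        'text/html, image/png',    # a list, not a single media type
    ]
    for sample in samples:
        print(sample, '->', is_short_content_type(sample))
```

Only \\ and \" are accepted as escapes inside the quotation marks, so a backslash followed by any other character makes the quoted string invalid.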

- Long version:

 <xs:pattern
value="([xX][-.][!#$%&amp;'*+-.0-9A-Z\\^_`a-z{|}~]+|[a-zA-Z]{4,})/([xX][-.][!#$%&amp;'*+-.0-9A-Z\\^_`a-z{|}~]+|[a-zA-Z0-9._+-]+)(;\s*[!#$%&amp;'*+-.0-9A-Z\\^_`a-z{|}~]+=([!#$%&amp;'*+-.0-9A-Z\\^_`a-z{|}~]+|&quot;([^&quot;\\]|\\\\|\\&quot;)*&quot;))*"/>

In addition to the "quoted string" update reported above, a backslash escape was missing from my token definition (as per
RFC 2045):

 [!#$%&'*+-.0-9A-Z\\^_`a-z{|}~]+
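
The long version can be tried out in the same way. Again, this is only a sketch under assumptions: Python stands in for a schema validator, &amp; and &quot; are written as plain & and ", re.fullmatch() imitates the implicit anchoring of XML Schema patterns, and the helper name is hypothetical.

```python
import re

# Token character class exactly as given above (&amp; written as &).
TOKEN_CHAR = r"[!#$%&'*+-.0-9A-Z\\^_`a-z{|}~]"

# Quoted-string part, with &quot; written as a plain ".
QUOTED_STRING = r'"([^"\\]|\\\\|\\")*"'

# Long ContentType pattern, assembled from the pieces above.
LONG_CONTENT_TYPE = (
    r'([xX][-.]' + TOKEN_CHAR + r'+|[a-zA-Z]{4,})'
    + '/'
    + r'([xX][-.]' + TOKEN_CHAR + r'+|[a-zA-Z0-9._+-]+)'
    + r'(;\s*' + TOKEN_CHAR + '+=('
    + TOKEN_CHAR + '+|' + QUOTED_STRING + '))*'
)


def is_long_content_type(text: str) -> bool:
    """Hypothetical helper: True if the whole text fits the long pattern."""
    return re.fullmatch(LONG_CONTENT_TYPE, text) is not None


if __name__ == "__main__":
    samples = [
        'text/html',
        'application/xhtml+xml; charset=utf-8',
        'x-custom/x-thing',
        'text/plain; title="a \\"quoted\\" name"',
        'foo/bar',               # type shorter than 4 letters, no x- prefix
        'text/html; charset',    # parameter without a value
    ]
    for sample in samples:
        print(sample, '->', is_long_content_type(sample))
```

Compared with the short version, the type must be either an x- (or x.) extension or at least four letters, so a string such as foo/bar is rejected.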


Cordially,
Alexandre
http://alexandre.alapetite.net

Received on Friday, 14 July 2006 00:13:38 UTC