- From: Pete Cordell <petexmldev@tech-know-ware.com>
- Date: Thu, 4 Jan 2007 09:35:03 -0000
- To: "Tsao, Scott" <scott.tsao@boeing.com>, <xmlschema-dev@w3.org>
Just a note on the greediness mentioned here...

My understanding is that the greediness of a regular expression is only an issue when you are capturing the values of sub-patterns within the target data (e.g. in Perl, 'ML123' =~ /ML(\d+)/; captures 123 into $1). When you are only matching, the pattern will eventually be matched, if a match is possible at all, irrespective of whether greedy or non-greedy matching is used. Greediness just affects whether the regular expression engine attempts to grab as much content as it can for a sub-expression in its first attempt and then backtracks, or attempts to capture the minimal amount in its first attempt and then forward tracks (not sure if that's a proper term!).

If anyone's opinion differs, please let me know.

Pete.
--
=============================================
Pete Cordell
Tech-Know-Ware Ltd
for XML to C++ data binding visit
http://www.tech-know-ware.com/lmx
(or http://www.xml2cpp.com)
=============================================

----- Original Message -----
From: "Tsao, Scott"

A colleague raised the question below regarding the use of the XSD pattern facet. Can someone help please?

Thanks,
Scott Tsao
Enterprise Architecture and Integration
The Boeing Company

-----Original Message-----

I'm trying to design a W3C XML Schema type description for an element containing an arbitrary number of quoted strings separated by arbitrary whitespace. The contents of the quoted items are themselves limited to alphanumerics, whitespace, and common punctuation characters, excluding embedded quote characters. (The double quote here is chosen as an arbitrary delimiter and has no special significance.)

Example: "abc" "de f" "123_456" "foo bar" "etc."

I'm not aware of a "built-in" XML Schema type that can support this representation directly. It also appears that the W3C XML Schema "pattern" facet (allowing the specification of a regular expression for a type format) does not support the "non-greedy" quantifier syntax, e.g., "*?", "+?", that is common in many regular expression engines.

Can anyone suggest a regex to define this format without the non-greedy quantifiers, or perhaps an XML Schema representation that can handle this format directly?
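As a rough illustration of both points in this thread, the sketch below uses Python's re module rather than an XSD processor. It assumes that a negated character class, [^"], is an acceptable stand-in for the lazy ".*?" idiom, that at least one quoted item is required, and that items are separated by at least one whitespace character; none of these choices come from the original question. The names QUOTED_LIST and is_quoted_list are hypothetical, and re.fullmatch is used to mimic the way a pattern facet must match the whole value. Note that [^"] is broader than the "alphanumerics, whitespace, and common punctuation" described above and would need narrowing for a real schema.

```python
import re

# Hypothetical pattern (not from the thread): one or more quoted items,
# each a quote, a run of non-quote characters, and a closing quote.
# The negated class [^"] does the job a lazy ".*?" is usually used for,
# so no non-greedy quantifier is needed.
QUOTED_LIST = r'\s*"[^"]*"(\s+"[^"]*")*\s*'


def is_quoted_list(text: str) -> bool:
    """Return True if the whole string is a whitespace-separated list of quoted items."""
    return re.fullmatch(QUOTED_LIST, text) is not None


# Greediness changes what a sub-pattern captures in an unanchored match...
greedy_capture = re.match(r'ML(\d+)', 'ML123').group(1)
lazy_capture = re.match(r'ML(\d+?)', 'ML123').group(1)

# ...but once the whole string must match, the lazy form is forced to
# extend until the match succeeds, so both forms accept the same input.
lazy_full_capture = re.fullmatch(r'ML(\d+?)', 'ML123').group(1)

# For the same reason, a lazy ".*?" would not keep quotes out of an item
# under whole-string matching: it still stretches across the inner quotes.
lazy_spans_quotes = re.fullmatch(r'".*?"', '"abc" "def"') is not None

if __name__ == "__main__":
    print(is_quoted_list('"abc" "de f" "123_456" "foo bar" "etc."'))
    print(greedy_capture, lazy_capture, lazy_full_capture)
    print(lazy_spans_quotes)
```

If this reading is right, the same expression text should carry over to a pattern facet more or less unchanged, since negated character classes and \s are part of the XML Schema regular expression syntax; that has not been checked against any particular schema processor here.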
Received on Thursday, 4 January 2007 09:40:04 UTC