- From: Michael Kay <mike@saxonica.com>
- Date: Mon, 16 May 2011 22:07:55 +0100
- To: xmlschema-dev@w3.org
On 16/05/2011 19:54, Costello, Roger L. wrote:
> Hi Folks,
>
> 1. Is every XML Schema validator guaranteed to support the same set of Unicode characters?

Firstly, let's assume we are talking about conformant XSD processors. There are many that aren't conformant, and regex support is a notorious black spot for this.

As far as conformant processors are concerned, the spec offers implementors freedom to choose which version of Unicode they will support. So if the definitions of character groups like Nd change from one Unicode version to the next, this may be reflected in differences between schema processors. In practice this is only likely to affect you if you are on the bleeding edge of the Unicode repertoire.

> 2. Is every version of XML Schema guaranteed to support the same set of Unicode characters as all other versions?

See above.

> 3. Does XML determine the set of characters supported by XML Schema? That is, does XML Schema support the set of Unicode characters specified in the XML specification?

Yes - but XML itself allows new characters when Unicode adds them.

> 4. If I use this regex in my XML Schema:
>
>     [^0-9]*
>
> Is there a risk that:
>
> a. The set of strings described by the regex may vary, depending on the XML Schema validator (or an XML Schema application)?
>
> b. With different versions of XML Schema (e.g., XML Schema 1.0, XML Schema 1.1) the regex may describe different sets of strings?

No, for a simple regex like this you'll get the same results with every processor. Even non-conformant processors, unless they're pathological.

Michael Kay
Saxonica
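
As a rough illustration of the distinction drawn above, here is a minimal Python sketch. It is not an XSD processor: Python's `re` module stands in for the schema regex engine, `fullmatch` is used because XSD patterns must match the whole value, and the helper names are hypothetical. The range `[^0-9]` is defined by fixed code points, whereas membership of the Nd category is looked up in whichever Unicode database the runtime ships, so only the latter can shift between versions.

```python
import re
import unicodedata

# Stand-in for the XSD pattern [^0-9]* (assumption: Python's re, not an XSD engine).
NON_ASCII_DIGIT_RUN = re.compile(r"[^0-9]*")


def matches_pattern(value: str) -> bool:
    """True if the whole value consists of characters other than ASCII 0-9."""
    return NON_ASCII_DIGIT_RUN.fullmatch(value) is not None


def is_decimal_digit(ch: str) -> bool:
    """True if ch is in general category Nd in this runtime's Unicode database."""
    return unicodedata.category(ch) == "Nd"


if __name__ == "__main__":
    # The Unicode version behind is_decimal_digit depends on the Python build.
    print("Unicode database version:", unicodedata.unidata_version)
    for sample in ["abc", "a1b", "\u0663"]:  # U+0663 is ARABIC-INDIC DIGIT THREE
        print(repr(sample), matches_pattern(sample))
```

Note that U+0663 satisfies `[^0-9]*` because the range only excludes the ten ASCII digits; a pattern written with `\p{Nd}` would treat it as a digit, and that is the kind of pattern where the processor's Unicode version could matter.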
Received on Monday, 16 May 2011 21:08:19 UTC