RE: forbiddenCharacters data category - related to [ACTION-189]

From: Yves Savourel <ysavourel@enlaso.com>
Date: Mon, 27 Aug 2012 14:30:57 -0600
To: <public-multilingualweb-lt@w3.org>
Message-ID: <assp.05862b18b0.assp.058666d8ee.00c701cd8492$dd63f6a0$982be3e0$@com>
> But I don't think we should disallow [A-[B]] as
> this syntax is available in XML Schema and XPath
> 2.0/XQuery 1.0 -- there are plenty of existing 
> implementations around.
> Moreover other languages offer similar syntax. For 
> example in Java you can map this to [A&&[^B]] if 
> I'm not mistaken.

But, as you show, such character class subtraction has no common syntax across engines, and some engines do not support it at all (JavaScript, as far as I know). That's why I think we should avoid adopting all the features of any one engine. For example, we wouldn't allow \d, \w, etc. either, because they are not interoperable.

On the other hand, the following simpler subset is, I think, fully interoperable:

- The set is defined between square brackets ('[' and ']').
- One or more operators '-' MAY be used to indicate ranges.
- The prefix '^' MAY be used just after the opening bracket to invert the selection.
- The characters '[', ']', '-', '^' and '\' MUST be prefixed with '\' when used as literals.
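
To make the subset concrete, here is a rough Python sketch of a checker for the four rules above. It is only an illustration, not proposed spec text. The name is_portable_set is hypothetical, and two points are my own assumptions because the rules do not settle them: an empty set '[]' is rejected, and '\' may only precede one of the five special characters (so \d, \w, etc. are rejected).

```python
import re

# One member of the set: either an escaped special character
# ('[', ']', '-', '^', '\'), or any single character that is not special.
# Assumption: a backslash before any other character is an error.
_ITEM = r"(?:\\[\[\]\-\^\\]|[^\[\]\-\^\\])"

# Whole expression: '[', optional '^', one or more members or ranges, ']'.
# Assumption: at least one member is required, so '[]' is rejected.
_PORTABLE_SET = re.compile(rf"\[\^?(?:{_ITEM}(?:-{_ITEM})?)+\]")


def is_portable_set(expr: str) -> bool:
    """Return True if expr uses only the simple character-set subset."""
    return _PORTABLE_SET.fullmatch(expr) is not None


if __name__ == "__main__":
    samples = [r"[a-z]", r"[^0-9A-F]", r"[\[\]\-\^\\]", r"[A-[B]]", r"[\d]"]
    for sample in samples:
        print(sample, is_portable_set(sample))
```

With this sketch, expressions such as [a-z] and [^0-9A-F] are accepted, while [A-[B]] and [\d] are rejected because they rely on engine-specific features.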

Received on Monday, 27 August 2012 20:31:25 UTC