[Bug 1363] please clarify the role of flags, collation tables, mapping tables

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1363





------- Additional Comments From mike@saxonica.com  2005-05-11 08:38 -------
We have always taken the view that:

(a) regular expression processing is not intended for analysis of natural
language text (that's the job of the free-text search capabilities). It's
intended for low-level manipulation of character patterns, and if the user
wants to use it, for example, to parse dates, they need to define exactly which
date formats are to be handled. Thus a pattern such as [a-j] should mean
exactly what it says (a set of 10 Unicode codepoints), and should not include
a-umlaut, for example, just because the user is in a locale where a-umlaut
collates between a and j. If the user wants to include a-umlaut then they
should say so explicitly in the pattern.

(b) we're not prepared to push the state of the art in regular expression theory
or practice. We're picking up proven technology here; it would be too risky to
try to invent anything new.
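
To illustrate point (a), here is a minimal sketch in Python. It is only an
analogy: Python's re module is not the regex dialect under discussion, but it
also treats a range such as [a-j] as a plain codepoint range, independent of
collation. The names PATTERN, EXPLICIT and matched_chars are hypothetical,
chosen for this example only.

```python
import re

# The pattern from point (a): a plain codepoint range, U+0061..U+006A.
PATTERN = re.compile("[a-j]")

# A user who wants a-umlaut (U+00E4) has to list it explicitly.
EXPLICIT = re.compile("[a-j\u00e4]")


def matched_chars(pattern, candidates):
    """Return the candidate characters that the pattern matches in full."""
    return [ch for ch in candidates if pattern.fullmatch(ch)]


# Test every codepoint in Basic Latin and Latin-1 Supplement (U+0000..U+00FF).
latin1 = [chr(cp) for cp in range(0x100)]

in_range = matched_chars(PATTERN, latin1)
in_explicit = matched_chars(EXPLICIT, latin1)

print(in_range)     # the ten letters a..j, nothing else
print(in_explicit)  # the same ten letters plus a-umlaut
```

The range matches exactly ten characters whatever the user's locale; a-umlaut
is matched only when the pattern names it.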

Michael Kay (personal response)

Received on Wednesday, 11 May 2005 08:38:30 UTC