- From: <bugzilla@wiggum.w3.org>
- Date: Wed, 11 May 2005 08:38:26 +0000
- To: public-qt-comments@w3.org
- Cc:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=1363

------- Additional Comments From mike@saxonica.com 2005-05-11 08:38 -------

We have always taken the view that

(a) regular expression processing is not intended for analysis of natural language text (that's the job of the free-text search capabilities). It's intended for low-level manipulation of character patterns, and if the user wants to use it, for example, to parse dates, they need to define exactly what the formats of the dates to be handled are. Thus a pattern such as [a-j] should mean exactly what it says (a set of 10 Unicode codepoints), and should not include a-umlaut, for example, just because the user is in a locale where a-umlaut collates between a and j. If the user wants to include a-umlaut then they should say so.

(b) we're not prepared to push the state of the art in regular expression theory or practice. We're picking up proven technology here; it would be too risky to try to invent anything new.

Michael Kay (personal response)
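
To illustrate point (a), here is a minimal sketch of the codepoint-range reading of [a-j]. It uses Python's re module purely as a stand-in, not the XPath/XQuery regex functions under discussion; the variable names are hypothetical and chosen only for this illustration. It assumes a-umlaut is supplied as the single precomposed codepoint U+00E4.

```python
import re

# [a-j] read as a plain codepoint range: U+0061 through U+006A.
pattern = re.compile(r"[a-j]")

# Enumerate every codepoint in the Basic Multilingual Plane that the
# class matches; locale and collation play no part in this.
matched = [chr(cp) for cp in range(0x10000) if pattern.fullmatch(chr(cp))]

# a-umlaut as one precomposed codepoint (an assumption of this sketch).
a_umlaut = "\u00e4"

# The range does not pick up a-umlaut, wherever it may collate.
in_range = pattern.fullmatch(a_umlaut) is not None

# A user who wants a-umlaut included says so explicitly in the class.
explicit = re.compile("[a-j\u00e4]")
in_explicit = explicit.fullmatch(a_umlaut) is not None

print(len(matched), "".join(matched))
print(in_range, in_explicit)
```

Under this reading the class is fully determined by the pattern text, so the same pattern behaves identically whatever the user's locale.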
Received on Wednesday, 11 May 2005 08:38:30 UTC