- From: <bugzilla@wiggum.w3.org>
- Date: Wed, 17 Aug 2005 13:36:50 +0000
- To: public-qt-comments@w3.org
- Cc:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=1851

------- Additional Comments From mike@saxonica.com 2005-08-17 13:36 -------

First a meta-comment: as you observe, many specifications of regular expression semantics are amazingly informal, and we already do very well compared with other languages such as Perl and Java. It's good to get the specification precise, but there may be a point at which it's better to leave things a little fuzzy at the edges to allow implementors to do whatever the underlying library does. If users are managing to write Perl and Java without precise guarantees of behaviour in edge cases, perhaps this isn't a problem we need to solve. There's also a danger that if we overspecify, some of the implementors who are less concerned about 100% conformance may simply ignore us.

You're right that captured subgroups now apply not only to replace(), but also to back-references (and also to xsl:analyze-string in XSLT). My expectation is that the captured substring for a subexpression that's matched zero times is the zero-length string. In XSLT we say this quite explicitly (see http://www.w3.org/TR/xslt20/#regex-group). This also applies to cases such as ((a)|(b)).

Michael Kay
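
As a rough illustration of the zero-length-string convention described above, here is a minimal sketch in Python. It is only an analogy: Python's `re` module is not the XPath/XSLT regex dialect, and by default it reports a group that did not participate in the match as `None`. The helper name `captured_groups` is hypothetical; it maps such groups to `""` to mimic the expected behaviour.

```python
import re


def captured_groups(pattern, text):
    """Return the captured groups of the first match of `pattern` in `text`.

    Groups that did not participate in the match (matched zero times) are
    reported as the zero-length string, mirroring the convention described
    in the message. Returns None if there is no match at all.
    This helper is an illustrative assumption, not part of any specification.
    """
    m = re.search(pattern, text)
    if m is None:
        return None
    # Match.groups(default=...) substitutes `default` for non-participating groups.
    return m.groups(default="")


# For ((a)|(b)) against "a", group 3 never participates, so it is reported as "".
print(captured_groups(r"((a)|(b))", "a"))
# For (x)*y against "y", group 1 is matched zero times, so it is reported as "".
print(captured_groups(r"(x)*y", "y"))
```

An underlying library that returns a null value rather than an empty string for such groups would need a similar normalisation step to follow this convention.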
Received on Wednesday, 17 August 2005 13:37:05 UTC