- From: <bugzilla@wiggum.w3.org>
- Date: Sat, 05 Jan 2008 11:00:20 +0000
- To: public-qt-comments@w3.org
- CC:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=5348

           Summary: [F&O] Back-references: "sufficiently many"
           Product: XPath / XQuery / XSLT
           Version: Recommendation
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Functions and Operators
        AssignedTo: mike@saxonica.com
        ReportedBy: mike@saxonica.com
         QAContact: public-qt-comments@w3.org

In the specification for back-references in regular expressions (repeated
unchanged in Erratum E4), we use the phrase:

<quote>
The construct \n where n is a single digit is always recognized as a
back-reference; if this is followed by further digits, these digits are taken
to be part of the back-reference if and only if the back-reference is preceded
by sufficiently many capturing subexpressions.
</quote>

So what happens if the regular expression uses \11, and it is preceded by 12
capturing subexpressions, but there is no subexpression 11 because the closing
paren for group 11 has not yet been encountered? That is:

(1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11(12)(13)\11)

Is \11 interpreted as a reference to the non-existent group 11, or as a
reference to group 1 followed by the digit 1?

I think it should be the latter. This involves changing the text to:

...these digits are taken to be part of the back-reference if and only if the
back-reference is preceded by a capturing subexpression with the relevant
number (so \12 is treated as a reference to captured subexpression 12 if the
back-reference is preceded by the closing parenthesis that matches the 12th
opening parenthesis).

The error condition described in erratum E4 as:

<quote>
The regular expression is invalid if this subexpression does not exist or if
its closing right parenthesis occurs after the back-reference.
</quote>

can then occur only for a single-digit back-reference.

Editorially, it might be appropriate to reorder the sentences in the resulting
paragraph.
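
To make the proposed rule concrete, here is a minimal Python sketch of how a
regex parser might resolve back-references under it. It is illustrative only
and is not taken from Saxon or from the specification. The function name
resolve_backrefs and its return shape are hypothetical, and the sketch assumes
that every unescaped "(" opens a capturing subexpression (it does not parse
character class expressions).

```python
def resolve_backrefs(regex):
    """Return (reference_text, group_number, is_valid) for each back-reference.

    Assumption: every unescaped "(" is a capturing group; character class
    expressions such as [()] are not handled by this sketch.
    """
    digits = "0123456789"
    open_stack = []   # numbers of groups opened but not yet closed
    closed = set()    # numbers of groups whose ")" has already been seen
    next_group = 1
    results = []
    i = 0
    while i < len(regex):
        ch = regex[i]
        if ch == "\\" and i + 1 < len(regex):
            if regex[i + 1] in "123456789":
                # A single digit is always recognized as a back-reference.
                num = int(regex[i + 1])
                j = i + 2
                # Further digits are absorbed only while the longer number
                # names a group that is already closed at this point.
                while j < len(regex) and regex[j] in digits:
                    candidate = num * 10 + int(regex[j])
                    if candidate not in closed:
                        break
                    num = candidate
                    j += 1
                # Invalid (per erratum E4) if the group is not yet closed.
                results.append((regex[i:j], num, num in closed))
                i = j
            else:
                i += 2  # some other escape such as \( or \d; skip both chars
            continue
        if ch == "(":
            open_stack.append(next_group)
            next_group += 1
        elif ch == ")" and open_stack:
            closed.add(open_stack.pop())
        i += 1
    return results
```

For the expression in this report, the sketch resolves \11 as a reference to
group 1 followed by a literal digit 1, because group 11 is still open when the
back-reference is reached. With eleven closed groups in front of it, the same
\11 resolves to group 11. In this sketch the invalid case arises only for a
single-digit reference such as (a\1).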
Received on Saturday, 5 January 2008 11:00:24 UTC