[Bug 5348] [F&O] Back-references: "sufficiently many"

http://www.w3.org/Bugs/Public/show_bug.cgi?id=5348


mike@saxonica.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




------- Comment #3 from mike@saxonica.com  2008-04-22 17:13 -------
On 22 April 2008 the WG decided as follows: a back-reference \11 that occurs
after the 11th opening parenthesis, but before the closing parenthesis that
matches it, should be an error. The editor was asked to propose wording to
achieve this.

The proposed change affects text that currently appears in Erratum FO.E4:

http://www.w3.org/XML/2007/qt-errata/xpath-functions-errata.html#E4

The proposed revision of that paragraph, with changes marked using (*...*) for
new text, (:...:) for deleted text, and (^...^) for moved text, is:

Back-references are allowed outside a character class expression. A
back-reference is an additional kind of atom.  The construct \N where N is a
single digit is always recognized as a back-reference; if this is followed by
further digits, these digits are taken to be part of the back-reference if and
only if the resulting number NN is such that the back-reference is preceded by
(*NN or more opening parentheses*) (:sufficiently many capturing
subexpressions:). (^The regular expression is invalid if (*a back-reference
refers to a*) (:this:) subexpression (*that*) does not exist or (*whose*) (:if
its:) closing right parenthesis occurs after the back-reference.^)
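
To make the revised rule concrete, here is a rough Python sketch of how a
processor might resolve and validate back-references under the wording above.
It is illustrative only and not taken from any implementation: the function
name resolve_backrefs is hypothetical, every unescaped "(" outside a character
class is assumed to open a capturing subexpression, and character classes are
tracked only by bracket depth.

```python
DIGITS = "0123456789"


def resolve_backrefs(regex):
    """Return [(offset, group_number)] for each back-reference in regex.

    Raises ValueError where the revised wording makes the regex invalid.
    Sketch only: every unescaped '(' outside a character class is assumed
    to open a capturing subexpression.
    """
    opened = 0        # unescaped '(' seen so far
    open_stack = []   # groups opened but not yet closed
    closed = set()    # groups whose closing ')' has been seen
    class_depth = 0   # nesting depth of [...] (allows for subtraction)
    refs = []
    i = 0
    while i < len(regex):
        ch = regex[i]
        if ch == "\\":
            nxt = regex[i + 1] if i + 1 < len(regex) else ""
            if class_depth == 0 and nxt != "" and nxt in DIGITS:
                # \N with a single digit is always a back-reference.
                n = int(nxt)
                j = i + 2
                # Further digits belong to the reference only if NN or
                # more opening parentheses precede the back-reference.
                while j < len(regex) and regex[j] in DIGITS:
                    candidate = n * 10 + int(regex[j])
                    if candidate > opened:
                        break
                    n = candidate
                    j += 1
                if n < 1 or n > opened:
                    raise ValueError(f"\\{n} at offset {i}: no such subexpression")
                if n not in closed:
                    raise ValueError(f"\\{n} at offset {i}: subexpression not yet closed")
                refs.append((i, n))
                i = j
            else:
                i += 2  # skip any other escaped character
            continue
        if ch == "[":
            class_depth += 1
        elif ch == "]" and class_depth > 0:
            class_depth -= 1
        elif class_depth == 0:
            if ch == "(":
                opened += 1
                open_stack.append(opened)
            elif ch == ")" and open_stack:
                closed.add(open_stack.pop())
        i += 1
    return refs
```

With this sketch, (a)\11 is read as \1 followed by a literal 1, because only
one opening parenthesis precedes the back-reference. Ten closed groups followed
by (b\11) is rejected, because \11 is recognized (eleven opening parentheses
precede it) but the 11th subexpression is still open at that point.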

The unchanged text that follows is moved into a new paragraph:

A back-reference matches the string that was matched by the nth capturing
subexpression within the regular expression, that is, the parenthesized
subexpression whose opening left parenthesis is the nth unescaped left
parenthesis within the regular expression.  For example, the regular expression
('|").*\1 matches a sequence of characters delimited either by an apostrophe at
the start and end, or by a quotation mark at the start and end.
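
The behaviour of this example can be tried out with Python's re module, which
handles this simple pattern the same way, although its regex dialect is not the
F&O one. The helper name is_delimited is hypothetical, and fullmatch is used so
that the whole string has to match:

```python
import re

# The example pattern from the text: an opening quote, any characters,
# then whatever group 1 matched.
DELIMITED = re.compile(r"""('|").*\1""")


def is_delimited(s):
    """True if s starts and ends with the same kind of quote character."""
    return DELIMITED.fullmatch(s) is not None
```

Here is_delimited("'abc'") and is_delimited('"abc"') both succeed, while a
string that opens with an apostrophe and closes with a quotation mark does not
match, since \1 must repeat exactly what the first subexpression matched.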

Received on Tuesday, 22 April 2008 17:13:51 UTC