RE: back references in regexen

I'd like to voice some support for Tobi.  I agree with all of his
arguments - back references are easy to implement, are available in
every regex library I have used over the past several years, and are
extremely useful.  Some tasks become substantially more difficult
without back references.

I did a survey of regular expressions in use by customers of mine in
real-world applications.  Every single customer had used back references
at least once, and approximately 20% of the regular expressions in use
contained at least one back reference.

I'm not sure the argument to "match what is used in XML Schema" holds a
lot of water.  XML Schema uses the subset of regex which is appropriate
for the task at hand.  One does not normally need back references to
define an acceptable pattern match for a type.  But, when regex is used
more generally, back references provide an enormous amount of power.

They also lead to better optimizations and smaller, tighter code.
Without back references, dealing with two different quote delimiters
(single vs. double) becomes cumbersome, especially if the pattern
matching within the quotes is complex.  In effect, the complex pattern
has to be written twice, once for the single quote and again for the
double quote.  (Note that this is a good example of why XML Schema would
perhaps be better off with back references, too.)
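
To make the quoting case concrete, here is a minimal sketch in Python
rather than XQuery, using Python's standard re module.  The INNER
pattern and the function names are hypothetical stand-ins chosen for
illustration, not taken from any customer's code:

```python
import re

# Hypothetical stand-in for a "complex" pattern inside the quotes.
INNER = r"[A-Za-z][\w.-]*"

# With a back reference: group 1 captures the opening quote and \1
# requires the same character to close, so INNER appears only once.
with_backref = re.compile(r"([\"'])(" + INNER + r")\1")

# Without a back reference: INNER is written once per delimiter.
without_backref = re.compile('"(' + INNER + ')"' + "|'(" + INNER + ")'")


def quoted_value(text):
    """Return the first quoted value in text, or None (back-reference version)."""
    m = with_backref.search(text)
    return m.group(2) if m else None


def quoted_value_no_backref(text):
    """Same result, but the value lands in a different group per quote style."""
    m = without_backref.search(text)
    if m is None:
        return None
    return m.group(1) if m.group(1) is not None else m.group(2)
```

With the back reference, INNER appears once and a mismatched pair such
as "abc' is rejected by the pattern itself.  Without it, every change
to INNER has to be made in two places, and the caller has to work out
which group actually matched.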

 

I believe many real-world users will be unpleasantly surprised if back
references are not supported.  And I am speaking as an implementer of
XQuery - I am not bothered by the 'extra work' I'll need to do to
implement and test back references.

So maybe back references could be a dreaded "optional" feature, provided
the syntax was standardized (which, of course, it already is).

            My $0.02,

            -Todd

-----Original Message-----
From: public-qt-comments-request@w3.org
[mailto:public-qt-comments-request@w3.org] On Behalf Of Ashok Malhotra
Sent: Friday, June 13, 2003 9:14 AM
To: public-qt-comments@w3.org; tobiasreif@pinkjuice.com
Subject: Re: back references in regexen

Dear Tobi:

Thank you for your note dated May 21, 2003 re. back references in
Regular expressions:

http://lists.w3.org/Archives/Public/public-qt-comments/2003May/0288.html

The Functions and Operators taskforce discussed your request during its
telcon on June 12, 2003.

While the functionality you suggest is interesting, it is complex and
the taskforce felt that the additional complexity was not warranted at
this point.  Also, we are trying to stay close to the regular expression
facilities in XML Schema, which do not have this functionality.

Please let us know if you are satisfied with this response. 

All the best, 

Ashok Malhotra on behalf of the Functions and Operators taskforce

Received on Friday, 13 June 2003 12:18:50 UTC