- From: David Carlisle <davidc@nag.co.uk>
- Date: Mon, 7 Jan 2002 09:58:12 GMT
- To: xsl-list@lists.mulberrytech.com
- CC: www-xml-query-comments@w3.org, Jeni Tennison <jeni@jenitennison.com>
> Most regular expression languages don't find overlapping matches, do
> they? It seems to add a lot of extra complexity if they do.

No, but then they don't return a list of all matches either. In emacs, for example, I regexp-search to the first occurrence and can then choose to restart the search from the end of the found text or the beginning (or anywhere else). In XPath as currently spec'd I'm forced to find all the non-overlapping occurrences in the entire text even if I only want to find the first, make a replacement, and then start searching again in the new, partly edited text.

> In the description of xf:replace() it says:
>
> The value of $repval may use the standard regular expression syntax
> of "$N"

Oops, I just missed that (there are a lot of documents to skim over :-). However, I don't think that this is particularly useful (see below).

> I don't think that the xf:match() function needs to return the
> positions of the subexpressions, or the subexpressions themselves,
> because that functionality could be achieved via xf:replace(). For
> example, to find out what string was matched by the first
> subexpression you could just use "$1" as the replace value.

Looking at why one needs regexps in an XML query language, it is usually to infer structure in otherwise unstructured (by XML) input, i.e. to "UP TRANSLATE" in omnimark parlance.

Here's a snippet of an omnimark script I had lying around:

  TRANSLATE "'" (letter+ ) => found-text "'"
    OUTPUT "<e>%x(found-text)</e>"

This changes 'abc' to <e>abc</e>.

You could of course do something similar with perl. However, you could not do this with xf:replace. In perl and omnimark you can add XML markup as a string, as their underlying data structures are not as tree oriented as XPath's. In XPath you can't do that. So a replace function that only lets you replace one piece of unstructured input by some more unstructured output is not particularly useful.
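To make the string-versus-tree distinction concrete, here is a rough sketch in Python (not XPath, and not anything from the drafts). It assumes Python's `re` and `xml.etree.ElementTree` as stand-ins for a regexp engine and a tree data model. The names `QUOTED`, `replace_as_string` and `up_translate` are made up for illustration. The first half shows that a "$1"-style replacement only yields more characters, and the second half shows that match positions are enough to split the text and build real nodes.

```python
import re
import xml.etree.ElementTree as ET

# Same shape as the omnimark pattern: a quote, one or more letters, a quote.
QUOTED = re.compile(r"'([A-Za-z]+)'")


def replace_as_string(text):
    """String-to-string replacement, in the spirit of a "$1" replace value."""
    return QUOTED.sub(r"<e>\1</e>", text)


def up_translate(text, parent_tag="p"):
    """Split the text at the match positions and build <e> element nodes."""
    parent = ET.Element(parent_tag)
    last_node = None  # most recent <e>; text following it goes in its tail
    pos = 0
    for m in QUOTED.finditer(text):
        between = text[pos:m.start()]
        if last_node is None:
            parent.text = between
        else:
            last_node.tail = between
        last_node = ET.SubElement(parent, "e")
        last_node.text = m.group(1)
        pos = m.end()
    rest = text[pos:]
    if last_node is None:
        parent.text = rest
    else:
        last_node.tail = rest
    return parent


source = "say 'abc' and 'de' now"

# 1. The replaced string is still only characters: once it is stored in a
#    tree, the "<" is text and gets escaped on output.
flat = ET.Element("p")
flat.text = replace_as_string(source)
flat_xml = ET.tostring(flat, encoding="unicode")

# 2. Splitting at the match positions gives genuine <e> child nodes.
tree = up_translate(source)
tree_xml = ET.tostring(tree, encoding="unicode")

print(flat_xml)
print(tree_xml)
```

The first element has no children at all, only a text node that happens to contain angle brackets. The second has two `e` children, which is what the omnimark rule is really after.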
If, however, the match function returned the sequence of substrings matched, or equivalently a sequence of the match positions, then the string could be broken up and nodes added as required.

Actually it might be interesting (and more in the XPath style) to allow omnimark-style named variable binding (the found-text in the above) within the search string, which would then be accessed by a normal XPath variable reference, $found-text, in any functions triggered by the replacement code.

David
Received on Monday, 7 January 2002 04:59:10 UTC