[F&O] regular expressions: non-capturing groups

Hi

A while ago I wrote some XSLT2 to add syntax markup to code listings
(while transforming DocBook to XHTML) [1].

This basically worked well (enabling syntax highlighting) until I hit
a roadblock: when two keywords appear in direct succession, separated
by just one shared delimiter, the second one isn't matched (and thus
isn't marked up) [2]. My regex matches keywords by specifying the
possible delimiters around them, often including whitespace. If
there's just one space after the first keyword, that space is consumed
by the first match. The second keyword then can't be matched, because
the delimiter before it has already been consumed. Once I realized
that the syntax markup (and with it the syntax highlighting) doesn't
really work yet, I had to disable most of it, which is unfortunate.
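
To make the problem concrete, here is a minimal sketch in Python's re
module. It is only a stand-in: the pattern and the mark_up helper are
hypothetical simplifications, not the patterns from the actual XSLTs,
and Python's regex flavor differs from the one in the F&O draft. It
reproduces the two lines from [2]:

```python
import re

# Hypothetical stand-in for the real keyword patterns: a keyword must be
# preceded by start-of-string or whitespace, and followed by whitespace
# or end-of-string. Both delimiters are part of the match.
KEYWORDS = r"def|true"
CONSUMING = re.compile(r"(^|\s)(" + KEYWORDS + r")(\s|$)")

def mark_up(line):
    # Wrap each matched keyword and put the matched delimiters back.
    return CONSUMING.sub(r'\1<code class="keyword">\2</code>\3', line)

one_space = mark_up("def true")    # single shared delimiter
two_spaces = mark_up("def  true")  # each keyword has its own delimiter

print(one_space)
print(two_spaces)
```

With one space, the space belongs to the match for "def", so nothing
is left to serve as the leading delimiter of "true". With two spaces,
each keyword gets its own delimiter and both are marked up.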

Is there currently a way to specify regex groups which must be matched
but whose match isn't captured (consumed)?

If the current draft does not make this possible, I'd ask you to
consider adding the feature, e.g. to

  http://www.w3.org/TR/xpath-functions/#regex-syntax

I think existing regex implementations call these non-capturing
groups. The syntax could look like this:

  (?:)
  (?:notcaptured)

The group's content is prefixed with "?:".
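
One caveat that seems worth checking against those implementations:
in Python's re module at least (again only a stand-in, and the pattern
below is a hypothetical simplification), a group prefixed with "?:" is
not captured but is still consumed. The form that leaves the delimiter
available for the next match there is the zero-width lookahead,
written "(?=...)". A small sketch comparing the two:

```python
import re

line = "def true"

# Non-capturing group: the trailing delimiter is not captured, but it is
# still consumed, so "true" has no leading delimiter left to match.
non_capturing = re.compile(r"(?:^|\s)(def|true)(?:\s|$)")
found_non_capturing = [m.group(1) for m in non_capturing.finditer(line)]

# Zero-width lookahead: the trailing delimiter is only checked, not
# consumed, so it can still act as the leading delimiter of "true".
lookahead = re.compile(r"(?:^|\s)(def|true)(?=\s|$)")
found_lookahead = [m.group(1) for m in lookahead.finditer(line)]

print(found_non_capturing)
print(found_lookahead)
```

So what I'm after may be closer to what these engines call lookahead
than to "?:" as such; either way, the request is for a group that
must match without being consumed.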

Also see

  http://www.google.com/search?q=%22non-capturing%22+perl
  http://www.google.com/search?q=%22non-capturing%22+java

etc., e.g.

  http://piglet.uccs.edu/~cs301/perl/re.htm
  http://javaalmanac.com/egs/java.util.regex/NoGroup.html

I created a simple use case example

  http://www.pinkjuice.com/xslt2/non_capturing/

I may be missing a straightforward way to achieve what I want.

Perhaps off-topic: I don't understand why the example transformation
adds a space in some places (this doesn't happen with the actual
syntax markup XSLTs). For example, in the input there are two spaces
between the two keywords in the first line, and one space between
them in the second line. In the output there are three and two
spaces, respectively. I'd appreciate any feedback on this, on- or
off-list.

Thanks in advance for considering my potential request,
Tobi

[1]

http://www.pinkjuice.com/howto/vimxml/about.xml#colophon
http://www.pinkjuice.com/howto/vimxml/xslt/tinydbk2xhtml/
http://www.pinkjuice.com/howto/vimxml/xslt/tinydbk2xhtml/markup_syntax.xslt
http://www.pinkjuice.com/howto/vimxml/xslt/tinydbk2xhtml/syntax_markup_shared.xslt
http://www.pinkjuice.com/howto/vimxml/xslt/tinydbk2xhtml/markup_shell.xslt
http://www.pinkjuice.com/howto/vimxml/xslt/tinydbk2xhtml/markup_ruby.xslt
etc

[2] With the original XSLTs:

Test snippet added to ch04.xml:

def true
def  true

Result in moresetup.xml:

<code class="keyword">def</code> true
<code class="keyword">def</code>  <code class="keyword">true</code>

-- 
to
  bi
    as
  re
if

Received on Monday, 22 November 2004 18:03:49 UTC