- From: Tobias Reif <tobiasreif@pinkjuice.com>
- Date: Tue, 23 Nov 2004 10:50:59 +0100
- To: Svgdeveloper@aol.com
- Cc: public-qt-comments@w3.org
Hi Andrew:

I'll try to answer your questions below. If you need further info, please consider contacting me offlist.

WGs, spec editors and authors, et al:

The project and example I describe is just one potential use-case. Non-capturing groups would be generally useful, for many different use-cases and many users, no matter if they're required for this use-case or not.

The functionality can be considered to be inside the explicitly stated scope of the spec:

http://www.w3.org/TR/xpath-functions/#regex-syntax :

"7.6.1 Regular Expression Syntax

The regular expression syntax used by these functions is defined in terms of the regular expression syntax specified in XML Schema (see [XML Schema Part 2: Datatypes]), which in turn is based on the established conventions of languages such as Perl. However, because XML Schema uses regular expressions only for validity checking, it omits some facilities that are widely-used with languages such as Perl. This section, therefore, describes extensions to the XML Schema regular expressions syntax that reinstate these capabilities."

I don't know whether non-capturing groups are widely-used (it depends on the POV I suspect), but AFAIK they're an important part of popular regex implementations such as Perl's.

Since the feature is part of popular regex implementations it probably wouldn't add more than a few lines to most XSLT2 implementations. It also would add just a few lines to the spec.

Thanks in advance for considering my late request.

On Mon 2004-11-22 Svgdeveloper@aol.com wrote:

> Do you really need to capture the whitespace character(s) and do an
> xsl:copy-of of it/them?

As I said, possible delimiters can include white space (they're not always just whitespace). You could also check the linked XSLT files for actual examples.

Here are the input files containing the program listings which are to be marked up:

http://www.pinkjuice.com/howto/vimxml/docbook/

Here's an example of the output:

http://www.pinkjuice.com/howto/vimxml/tasks.xml#creatingdocuments

(eg the first XML listing; all other syntax markup currently is disabled)

> Would a simple replace with a single space character before and
> after work?

I don't really understand what you mean here.

> If so the following looks, after a brief look at your use case, to be a
> possible solution:
>
> regex="\s(((while|true|if|else|end)\s*)+)\s"
>
> If that doesn't work

Thanks for taking the time to try to help. Unfortunately I don't understand what you mean, or how it would work. You could either detail all changes I should make to the example XSLT, or (probably much more efficient) apply them yourself and see if your proposed solution works (I can't do this because I don't know which changes you're proposing, except for the new regex). If it works for the following, please send it (on- or off-list).

Here are the input strings from the example, each followed by the desired output:

  while true
  <span class="keyword">while</span> <span class="keyword">true</span>

  while true
  <span class="keyword">while</span> <span class="keyword">true</span>

Multiple spaces must not be collapsed (and no characters, space or otherwise, should be added). Just the keyword is to be marked up, excluding any delimiter (because the keyword might get styled to be underlined, or the delimiter might be non-whitespace).

Here, nothing should be matched:

  whiletrue

Other possible delimiters include parentheses (not included in the current example regex):

  (while true end)
  (<span class="keyword">while</span> <span class="keyword">true</span> <span class="keyword">end</span>)

Before a keyword there can be nothing (no character): ^ , and likewise after it: $ .

> perhaps you could explain in greater detail what it is you want to
> match.

Lots of different things, as is typical when applying syntax markup to various programming languages. Examples:

  $(command)
  "literal string"
  object.method # object and method names share the dot as delimiter
  object.method(args)
  function_call(args) # a recognized function
  `command`
  if true; then true; end
  (if x true else false end)
  # etc etc

I don't want to use XSLT2 (plus regexen) to fully parse complex code (I'd have to build a parser for this), but to apply simple markup to basic stuff such as keywords.

The linked input and XSLT files actually answer your question much better. Currently, when I have

  def true

inside a Ruby program in ch04.xml, I get

  <code class="keyword">def</code> true

instead of

  <code class="keyword">def</code> <code class="keyword">true</code>

in the output (moresetup.xml). But I thought it would simplify things if I created and posted a short example.

Tobi

--
to bi as re if
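
A minimal sketch of the behaviour described above, written in Python rather than XSLT2 because Python's re module has both non-capturing groups and lookaround assertions. It is not the regex from the linked XSLT files, and lookaround is not part of the XPath regex syntax quoted above. The function names mark_consuming and mark_asserting are hypothetical. The keyword list is taken from the regex quoted earlier, and the delimiter set (whitespace, parentheses, start and end of string) from the examples. The first pattern consumes its delimiters, which is one plausible way to end up with only the first of two adjacent keywords marked up. The second only asserts them.

```python
import re

# Keyword list taken from the regex quoted in the message.
KEYWORDS = ("while", "true", "if", "else", "end")
ALTERNATION = "|".join(KEYWORDS)

# Variant 1: the delimiters are part of the match, so a single space shared
# by two adjacent keywords is consumed by the first match and is no longer
# available as the leading delimiter of the second keyword.
CONSUMING = re.compile(r"(^|[\s()])(" + ALTERNATION + r")([\s()]|$)")

# Variant 2: the delimiters are only asserted by lookbehind/lookahead and
# never consumed. (?:...) groups the alternation without creating a
# numbered capture group.
ASSERTING = re.compile(r"(?<![^\s()])(?:" + ALTERNATION + r")(?![^\s()])")


def mark_consuming(text):
    """Wrap keywords, copying the captured delimiters back into the output."""
    return CONSUMING.sub(r'\1<span class="keyword">\2</span>\3', text)


def mark_asserting(text):
    """Wrap keywords only; the surrounding text is left untouched."""
    return ASSERTING.sub(
        lambda m: '<span class="keyword">' + m.group(0) + "</span>", text
    )


if __name__ == "__main__":
    for sample in ("while true", "whiletrue", "(while true end)"):
        print(mark_consuming(sample))
        print(mark_asserting(sample))
```

With the input "while true", mark_consuming marks up only "while", while mark_asserting marks up both keywords and leaves the space between them as it was. "whiletrue" is returned unchanged by both.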
Received on Tuesday, 23 November 2004 09:51:05 UTC