- From: Joel Yliluoma <bisqwit@iki.fi>
- Date: Mon, 16 Jan 2006 15:04:22 +0200 (EET)
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- cc: www-archive@w3.org
On Mon, 16 Jan 2006, Bjoern Hoehrmann wrote:

> I since ran into some other issues. It'd be good to have these at least
> in the documentation if not fixed (so others might look at it and
> contribute a patch or two).

Thanks.

> A re like x{0,4} is rewritten to x{,4}; the latter syntax is not widely
> supported, e.g. Perl would not treat this as quantifier but as literal.

I didn't know this. Thanks, it will be fixed in the next version, 1.1.1.

> A re like [a-z]+|[a-z]+ is rewritten to (?:|)[a-z]+; this should really
> be [a-z]+ instead.

This also will be fixed in the next version, 1.1.1.

> A re like foo|[a-z]+ comes out as (?:foo|[a-z]+); this could be further
> optimized to simply [a-z]+. This is like "Choice counting" which is
> already listed, but it'd be good to have this example in the docs, I
> think. The Perl module Regexp::Optimizer reduces (?:aa|a)b to aa?b but
> does not do this for (?:foo|[a-z]+).

Yes, this is the choice counting problem. I should think of a way to make
the program check if an alternative is a subset of another alternative,
and thus combine them if they are.

> It'd be handy to have control about what . is equivalent to, e.g. in
> Perl with the 's' modifier it's really any character, in XML Schema's
> regular expression language it's [^\r\n], etc.

Yes, it'd be nice...

-- 
Joel Yliluoma
http://iki.fi/bisqwit/
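
To illustrate the cases discussed above, here is a rough Python sketch. It is not the program's actual code; the names split_alternatives, is_literal, simplify_alternation and quantifier are hypothetical. It handles only a narrow form of the subset check: an alternative that is a plain literal string is dropped when some other alternative fully matches it. It assumes the pattern is also valid for Python's re module, treats character classes naively (a leading "]" inside a class is not handled), and the simplified pattern is equivalent only for whole-string matching, since with leftmost-first search foo|[a-z]+ and [a-z]+ can return different matches.

```python
import re

META = set(".^$*+?{}[]()|\\")  # characters with special meaning in a regex


def split_alternatives(pattern):
    """Split on top-level '|', skipping escapes, [...] classes and (...) groups."""
    parts, current = [], []
    depth = 0          # nesting depth of (...) groups
    in_class = False   # currently inside a [...] character class
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            current.append(pattern[i:i + 2])  # keep the escape pair together
            i += 2
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "|" and depth == 0:
            parts.append("".join(current))
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def is_literal(alt):
    """True if the alternative contains no regex metacharacters."""
    return not any(c in META for c in alt)


def simplify_alternation(pattern):
    """Drop duplicate alternatives and literals covered by another alternative."""
    alts = list(dict.fromkeys(split_alternatives(pattern)))  # dedupe, keep order
    kept = []
    for alt in alts:
        others = [o for o in alts if o != alt]
        # A literal denotes exactly one string, so it is a subset of another
        # alternative precisely when that alternative fully matches it.
        if is_literal(alt) and any(re.fullmatch(o, alt) for o in others):
            continue
        kept.append(alt)
    return "|".join(kept)


def quantifier(lo, hi):
    """Always emit the lower bound: {0,4}, never the less portable {,4}."""
    return "{%d,%d}" % (lo, hi)
```

With this sketch, simplify_alternation("[a-z]+|[a-z]+") and simplify_alternation("foo|[a-z]+") both give [a-z]+, while foo|[0-9]+ is left unchanged. A general subset test between two non-literal alternatives would need automaton-level comparison, which this sketch does not attempt.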
Received on Monday, 16 January 2006 13:02:00 UTC