- From: LdBeth <andpuke@foxmail.com>
- Date: Tue, 28 Jan 2025 10:13:56 -0600
- To: David Birnbaum <djbpitt@gmail.com>
- Cc: "Liam R. E. Quin" <liam@fromoldbooks.org>, ixml <public-ixml@w3.org>
>>>>> In <B160DE28-00C5-4A04-8841-313B5F72AB5F@gmail.com>
>>>>> David Birnbaum <djbpitt@gmail.com> wrote:

> Dear Liam (cc public-ixml),

> Thanks, Liam, for pointing this out. Yes, ixml will be available in
> XPath 4, making it easy to apply it to isolated parts of a
> document. It can already be used that way inside Saxon with the help
> of CoffeeSacks.

Since you are using CoffeePot: it has an extension meant for exactly this kind of problem. ixml itself provides no way to resolve ambiguity, and the extension is well suited to that.

https://docs.nineml.org/current/coffeepot/bk02ch06.html#choose-xpath

But a grammar without ambiguity can speed up parsing a lot, and for other reasons as well I would use the extension only as a last resort.

> This leaves me still wondering whether there are rules of thumb for
> choosing between using regex (e.g., analyze-string()) and using ixml
> when both are available. I approached this task in both ways and I
> find my regex-in-XSLT solution more legible and easier to
> understand, but I don’t know how much my perception of legibility or
> ease of understanding is about differences between the technologies
> vs differences in my experience and familiarity with them. “Use what
> you know” might get the particular job done more quickly, but
> sometimes learning something new pays off over time.

Regular expressions *with* conditional text replacement are very powerful. Even if the regex library being used is itself limited to regular languages (without backtracking, lookahead, etc.), with control flow it can still be used to simulate these advanced PCRE features with ease. On the other hand, ixml can be used to specify a grammar that would otherwise need a lot of backtracking to work.
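
To make the point about regular expressions plus control flow concrete, here is a minimal sketch in Python. It is an illustration only and not taken from the thread: the pattern, the function name `replace_foo_not_before_bar`, and the sample strings are all hypothetical. It emulates the negative lookahead `foo(?!bar)` with a pattern that contains no lookahead at all, leaving the decision to a replacement callback.

```python
import re

# Plain pattern with no lookahead: "foo", optionally followed by "bar".
PATTERN = re.compile(r"foo(bar)?")


def replace_foo_not_before_bar(text: str, replacement: str = "X") -> str:
    """Replace "foo" only where it is not immediately followed by "bar"."""

    def decide(match: re.Match) -> str:
        # Ordinary control flow stands in for the lookahead foo(?!bar).
        if match.group(1) is not None:
            # The match was "foobar": put it back unchanged.
            return match.group(0)
        # The match was a bare "foo": replace it.
        return replacement

    return PATTERN.sub(decide, text)


if __name__ == "__main__":
    print(replace_foo_not_before_bar("foo foobar foobaz"))
```

The same idea carries over to `analyze-string()` in XSLT, where the surrounding template logic plays the role of the callback.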

But the current draft of the standard still leaves a gap compared with what a full-blown regexp library can do: concepts like negative match need to be translated by hand (or by machine) before they can be used, and the inherently declarative nature of ixml makes tasks that would otherwise call for imperative-style programming difficult.

LdBeth
Received on Tuesday, 28 January 2025 16:21:19 UTC