Re: The NOT construct

On Tue, 2025-01-21 at 09:01 +0100, Nico Verwer (Rakensi) wrote:
> > 
> > functionCall: !(keyword, "("), QName,-"(", arguments, -")".
> > keyword: "if" | "return" | ....

This is a case where the lack of explicit tokenization causes pain. Of
course, as long as such cases are rare, it's worth avoiding the hassle.

Some musings from regular expression land...

For the keyword case one could conceivably have a construct that must
match a boundary. Some regular expression languages have \b, or \< and
\>, to match a boundary between word and non-word (\b is symmetrical, \<
matches the start and \> the end of a word). The construct does not
consume any input and does not match a character, so you can't write
\<* in such a language, but you could then say

functionCall: !(keyword \>), S*, QName ...
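
To make the boundary idea concrete, here is a rough sketch in Python
rather than ixml. Python's re module has \b but, as far as I know, not
\< and \>, so \b stands in for \> here. The keyword list is only the
two names from the quoted grammar, and starts_with_keyword is a made-up
helper name, not anything from ixml or a proposal.

```python
import re

# \b is zero-width: it matches at the edge between a word character and
# a non-word character (or the end of input) and consumes nothing.
# Assumption: only the two keywords named in the quoted grammar.
KEYWORD = re.compile(r"(?:if|return)\b")


def starts_with_keyword(text):
    """Return True if text begins with a whole keyword, not a mere prefix."""
    return KEYWORD.match(text) is not None
```

With this, "if (x)" starts with a keyword but "iffy(x)" does not, which
is the distinction the !(keyword \>) rule is after.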

A more general mechanism to insert a boundary marker in one rule and
match it in another seems like too much mechanism to me.

Some regexp languages have lookahead assertions, such that
  election (?!disaster)
matches election but not electiondisaster, and does not consume
whatever text follows the match.
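
A small Python sketch of that lookahead behaviour, for anyone who wants
to poke at it. I have assumed the pattern is written without the space,
so that it lines up with the electiondisaster example, and match_span is
just an illustrative helper name.

```python
import re

# (?!disaster) is a negative lookahead: it succeeds only if "disaster"
# does not come next, and it consumes no input either way.
# Assumption: no space between "election" and the lookahead.
PATTERN = re.compile(r"election(?!disaster)")


def match_span(text):
    """Return (start, end) of a match at the start of text, or None."""
    m = PATTERN.match(text)
    return m.span() if m else None
```

Note that when it does match, the span covers only the eight characters
of election; nothing after it is consumed.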

word: [a-zA-Z]+ >!nonword

could match Latin alphabet characters up to a nonword. This would
happen even without the >!nonword, but there's no backtracking here,
and now you can subtract keyword reliably:

functionCall: (word - keyword) S* "(" ...
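
As a sketch of what "word minus keyword" would mean in practice, again
in Python and not in ixml: match a maximal run of letters, reject it if
it is a keyword, then look for the opening parenthesis. The keyword set
is limited to the two names in the quoted grammar, S is approximated by
spaces only, and function_name is a hypothetical helper.

```python
import re

# Assumption: just the two keywords from the quoted grammar.
KEYWORDS = {"if", "return"}

# The lookahead pins the end of the word at a non-letter, mirroring
# the >!nonword idea: the match can never stop in the middle of a word.
WORD = re.compile(r"[a-zA-Z]+(?![a-zA-Z])")


def function_name(text):
    """Return the called name if text starts with word, spaces, "(".

    Returns None if there is no word, the word is a keyword, or no
    opening parenthesis follows.
    """
    m = WORD.match(text)
    if m is None or m.group() in KEYWORDS:
        return None
    rest = text[m.end():].lstrip(" ")  # S* approximated by spaces
    return m.group() if rest.startswith("(") else None
```

Because the whole word is matched before the keyword test, iffy (x) is
accepted as a call to iffy while if (x) is rejected.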


-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org

Received on Tuesday, 21 January 2025 08:14:26 UTC