- From: Liam R. E. Quin <liam@fromoldbooks.org>
- Date: Tue, 21 Jan 2025 03:14:21 -0500
- To: public-ixml@w3.org
On Tue, 2025-01-21 at 09:01 +0100, Nico Verwer (Rakensi) wrote:
> functionCall: !(keyword, "("), QName, -"(", arguments, -")".
> keyword: "if" | "return" | ....

This is a case where the lack of explicit tokenization causes pain - as long as such cases are rare, it's worth avoiding the hassle, of course.

Some musings from regular expression land...

For the keyword case one could conceivably have a construct that must match a boundary. Some regular expression languages have \b, or \< and \>, to match a boundary between word and non-word (\b is symmetrical; \< matches the start and \> the end of a word). The construct does not consume any input and does not match a character, so you can't write \<* in such a language, but you could then say

    functionCall: !(keyword \>), S*, QName ...

A more general mechanism to insert a boundary marker in one rule and match it in another seems like too much mechanism.

Some regexp languages have lookahead assertions, such that

    election (?!disaster)

matches "election" but not "electiondisaster", and does not consume the following word "disaster".

    word: [a-zA-Z]+ >!nonword

could match Latin alphabet characters up to a nonword; this would happen without the >!nonword, but there's no backtracking here, and now you can subtract keyword reliably:

    functionCall: (word - keyword) S* "(" ...

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations: http://www.fromoldbooks.org
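For anyone who wants to experiment with the regular-expression side of the message above, here is a minimal sketch using Python's `re` module. It is not ixml and not a proposal for ixml syntax; it only illustrates how a word boundary (`\b`) and a negative lookahead (`(?!...)`) behave. The names `KEYWORDS`, `FUNCTION_CALL_START` and `looks_like_function_call` are made up for the illustration, and the keyword list contains only the two keywords actually named in the thread.

```python
import re

# Hypothetical keyword list: only the two keywords named in the thread.
KEYWORDS = ("if", "return")

# Negative lookahead: "election" not immediately followed by "disaster".
# The lookahead inspects the input but consumes none of it.
ELECTION = re.compile(r"election(?!disaster)")

# Rough analogue of  functionCall: (word - keyword) S* "(" ...
#   (?!(?:if|return)\b)  reject a keyword that ends at a word boundary
#   [a-zA-Z]+            a run of Latin letters (the "word")
#   (?![a-zA-Z])         the word must not be followed by another letter
#   \s*\(                optional whitespace, then the opening parenthesis
FUNCTION_CALL_START = re.compile(
    r"(?!(?:%s)\b)[a-zA-Z]+(?![a-zA-Z])\s*\(" % "|".join(KEYWORDS)
)


def looks_like_function_call(text: str) -> bool:
    """Return True if text starts with a non-keyword word followed by '('."""
    return FUNCTION_CALL_START.match(text) is not None


if __name__ == "__main__":
    for sample in ("foo (x)", "if (x)", "iffy(x)", "return(x)", "returned(x)"):
        print(sample, "->", looks_like_function_call(sample))
```

The boundary check is what lets `iffy(x)` and `returned(x)` through while `if (x)` and `return(x)` are rejected: the keyword alternatives only count when they end at a word boundary. Note that Python's `\b` treats digits and underscores as word characters, which may differ from what a QName-based grammar would want.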
Received on Tuesday, 21 January 2025 08:14:26 UTC