- From: Norm Tovey-Walsh <norm@saxonica.com>
- Date: Thu, 08 Sep 2022 08:31:10 +0100
- To: Graydon Saunders <graydonish@gmail.com>
- Cc: public-ixml@w3.org
> With the two deletes lumped together.
>
> My reaction is to look for a way to make the matching non-greedy (I
> haven't found one) or to define "word" as "anything but this specific
> string". (Fairly sure that's impossible in ixml.)

I think it should be possible to rework this so that a priority (a CoffeeFilter extension to iXML) can be assigned to the shorter match, but the priority stuff is still kind of experimental and I wasn’t able to make that work in my first couple of tries.

> Is there a way to disambiguate this and guarantee that each delete or
> insert will start a block?

In principle, you could create a rule that matches sequences of characters that are neither ‘d’, ‘e’, ‘l’, ‘e’, ‘t’, ‘e’ nor ‘i’, ‘n’, ‘s’, ‘e’, ‘r’, ‘t’, but in practice I think that’d be much too (too!) large a combinatorial explosion.

Another way is to preprocess the input so that the keywords (“delete” and “insert”) can be made unambiguously different from ordinary words. I picked the Line Separator character (U+2028), which is neither a space nor part of a word in your grammar, and changed the rules (in what follows, I’ve used “?” instead of the actual line separator because the actual line separator is probably going to get mangled by email transmission):

    whole = (delBlock|insBlock)+,last,NL.
    delBlock = -'?', 'delete',space,(word,space)+.
    insBlock = -'?', 'insert',space,(word,space)+.
    last = word.
    -space = [Zs]+.
    word = [L;P;Nd;Sc]+.
    -NL = -#A.

and the input:

    ?delete the rest of the line and ?delete line 6 and ?insert “this; and that; and the other thing”.
That produces this, unambiguously:

    <whole>
       <delBlock>delete
          <word>the</word>
          <word>rest</word>
          <word>of</word>
          <word>the</word>
          <word>line</word>
          <word>and</word>
       </delBlock>
       <delBlock>delete
          <word>line</word>
          <word>6</word>
          <word>and</word>
       </delBlock>
       <insBlock>insert
          <word>“this;</word>
          <word>and</word>
          <word>that;</word>
          <word>and</word>
          <word>the</word>
          <word>other</word>
       </insBlock>
       <last>
          <word>thing”.</word>
       </last>
    </whole>

Hope that helps.

Be seeing you,
  norm

--
Norm Tovey-Walsh
Saxonica
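The preprocessing step described in this message (putting U+2028 in front of each “delete” and “insert” before the text reaches the iXML parser) might look something like the following in Python. This is only a rough sketch, not the script that was actually used: the function name `mark_keywords` is made up for illustration, and it assumes a keyword is a whitespace-delimited token, so every standalone “delete” or “insert” gets marked, including ones meant as ordinary words.

```python
import re

# U+2028 LINE SEPARATOR: not in [Zs] and not in [L;P;Nd;Sc], so the
# grammar treats it as neither a space nor part of a word.
LINE_SEPARATOR = "\u2028"

# Assumption: a keyword is a whole token, i.e. it is not preceded by a
# non-space character and is followed by whitespace.
_KEYWORD_RE = re.compile(r"(?<!\S)(delete|insert)(?=\s)")


def mark_keywords(text: str) -> str:
    """Prefix each standalone 'delete' or 'insert' with U+2028."""
    return _KEYWORD_RE.sub(lambda m: LINE_SEPARATOR + m.group(1), text)


if __name__ == "__main__":
    raw = ("delete the rest of the line and delete line 6 and "
           "insert \u201cthis; and that; and the other thing\u201d.\n")
    marked = mark_keywords(raw)
    # Show the marker as '?' so it survives terminals and email.
    print(marked.replace(LINE_SEPARATOR, "?"), end="")
```

Words that merely contain a keyword, such as “deleted” or “undelete”, are left alone. An input where “delete” or “insert” appears as an ordinary word inside a block would still need some other convention to tell the two uses apart.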
Received on Thursday, 8 September 2022 07:48:12 UTC