- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Mon, 19 Sep 2022 14:52:58 +0000
- To: M Joel Dubinko <micah@dubinko.info>, public-ixml@w3.org
Are you sure there aren't two completed parses for the trailing s, one empty and one including the NL? I would also expect two parses for the root, one including the NL and one not.

Steven

On Sunday 18 September 2022 05:56:48 (+02:00), M Joel Dubinko wrote:

> Talking through a technical issue for an audience of fellow implementers. Others may find this thread exceedingly dull. :)
>
> Hopefully writing this out should be enough for me to figure out the problem. (But if you're reading this, that means I hit 'Send' and thus could benefit from your perspective. Alan Kay is alleged to have said that a change of perspective is worth 80 IQ points...)
>
> In this example, the top rule I'm using is:
>
> ixml: s, rule++RS, s.
>
> Where s is optional whitespace. And the rest of the grammar is a reasonably close facsimile of the actual grammar; any minor exceptions shouldn't matter for what follows.
>
> The input is:
>
> doc = "a", "b".
>
> Including a newline at the end, that's 16 distinct Unicode characters presented as input.
>
> Here are the last few lines of completed traces of the parse (numbers after @ are the position after parsing that item, one past the end, a la C++ iterators):
>
> 430) 0:15👉 rule=( ---f-option78@0, name@3, s@4, -'='@5, s@6, -alts@14, -'.'@15 • )
> 433) 15:15👉 --f-option72=( • )
> 438) 15:16👉 whitespace=( -'
> '@16 • )
> 439) 15:15👉 --f-star71=( ---f-option72@15 • )
> 443) 16:16👉 --f-option77=( • )
> 441) 0:15👉 --f-plus-sep70=( rule@15, ---f-star71@15 • )
> 446) 16:16👉 --f-star76=( ---f-option77@16 • )
> 451) 15:15👉 --f-option74=( • )
> 448) 15:16👉 --f-plus75=( whitespace@16, ---f-star76@16 • )
> 453) 15:15👉 --f-star73=( ---f-option74@15 • )
> 454) 15:16👉 RS=( ---f-plus75@16 • )
> 455) 15:15👉 s=( ---f-star73@15 • )
> 459) 16:16👉 --f-option78=( • )
> 457) 0:15👉 ixml=( s@0, ---f-plus-sep70@15, s@15 • )
>
> From the bottom up, the final trace 457) looks like the parse is in good shape, I think. From character positions 0:15 we have a complete match on the top ixml rule (which aligns with the 'rule' match at 430), also at character positions 0:15). But to call it a success, the code that processes this trace is looking for a complete 0:16 match on the root rule 'ixml'.
>
> There _is_ a 'whitespace' match 438) on a newline spanning 15:16, but since the ixml rule ends with optional whitespace, the rule gets marked as complete as seen here. Downstream code then errors out, since not all of the input was matched by the root rule.
>
> And 454) is interesting. There's still a leftover task on 'ixml' trying to see if it can keep going.
>
> The solution is at the tip of my cortex, but I'm just not seeing it.
>
> Thoughts?
>
> j
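
The acceptance check discussed in this thread can be sketched roughly as follows. This is only an illustration, not code from either implementation: the `Completed` record, `accepts`, and `root_spans` are hypothetical names, and the two items in `expected_extra` are the additional completions Steven suggests should exist; they do not appear in the posted trace. The point is that a root completion at 0:15 and another at 0:16 can coexist, and only the one spanning the whole input counts as success.

```python
from typing import NamedTuple


class Completed(NamedTuple):
    """A completed item: rule name plus the half-open span start:end."""
    name: str
    start: int
    end: int


def accepts(completed, root, input_length):
    """True if some completed root item spans the entire input."""
    return any(
        item.name == root and item.start == 0 and item.end == input_length
        for item in completed
    )


def root_spans(completed, root):
    """End positions of every completed root item that starts at 0."""
    return sorted(
        item.end for item in completed if item.name == root and item.start == 0
    )


# A few completed items taken from the posted trace (input length is 16).
trace = [
    Completed("rule", 0, 15),
    Completed("whitespace", 15, 16),
    Completed("RS", 15, 16),
    Completed("s", 15, 15),
    Completed("ixml", 0, 15),
]

# Hypothetical items, assumed here only to illustrate the reply:
# a second completion of the trailing s that consumes the newline,
# and the root completion that follows from it.
expected_extra = [
    Completed("s", 15, 16),
    Completed("ixml", 0, 16),
]

print(accepts(trace, "ixml", 16))
print(accepts(trace + expected_extra, "ixml", 16))
print(root_spans(trace + expected_extra, "ixml"))
```

With only the items from the trace, the check fails because the single root completion stops at 15. If the parser also records the non-empty completion of the trailing s, the root gets a second completion ending at 16 and the check passes.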
Received on Monday, 19 September 2022 14:53:22 UTC