- From: John Lumley <john@saxonica.com>
- Date: Mon, 2 Mar 2026 16:16:03 +0000
- To: Bethan Tovey-Walsh <bytheway@linguacelta.com>
- Cc: ixml <public-ixml@w3.org>
Received on Monday, 2 March 2026 16:16:18 UTC
On 02/03/2026 16:09, Bethan Tovey-Walsh wrote:
> Let's take this grammar fragment:
>
> A: ["a"-"z"]*.
> B: "cat" ; "bat" ; "rat".
> C: A ¬ B.
>
> I think I understand your view of the semantics to be this:
>
> C is an A, unless the entirety of C also matches B, in which case it is a B
No - it's effectively:
C is an A, unless the entirety of C also matches B, in which case it is not a C
> So if we had the input "caterpillar", we'd get:
>
> <C>
> <A>caterpillar</A>
> </C>
Yes
> and if we had "cat", we'd get:
>
> <C>
> <B>cat</B>
> </C>
No - this would fail, as in the example where element() failed
> and if we had "", we'd get:
>
> <C>
> <A/>
> </C>
Yes, as B would not (yet) have succeeded when A did at the end of input.
> So, in the rule
>
> C: A ¬ B.
>
> we have something rather like
>
> C: A | B.
>
> in that C, if it matches, can be either an A or a B. The ¬ operator is simply a way to indicate that it cannot be *both* A and B.
No - it is more like a set-reduction (difference) operator.
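
A rough sketch of that difference reading, for illustration only: it hard-codes the three rules from the fragment above and is not how any ixml processor is actually implemented. The Python names (matches_a, matches_b, parse_c) are made up for this example, and the serialization is flattened onto one line.

```python
import re

# Assumed encodings of rules A and B from the grammar fragment.
A_PATTERN = re.compile(r"[a-z]*")      # A: ["a"-"z"]*.
B_WORDS = {"cat", "bat", "rat"}        # B: "cat" ; "bat" ; "rat".


def matches_a(text: str) -> bool:
    """True when the whole of text is matched by A."""
    return A_PATTERN.fullmatch(text) is not None


def matches_b(text: str) -> bool:
    """True when the whole of text is matched by B."""
    return text in B_WORDS


def parse_c(text: str):
    """C: A ¬ B, read as set difference.

    Returns the serialized tree for C, or None when C fails.
    B never appears in the result; it only removes candidates.
    """
    if not matches_a(text):
        return None          # not an A, so not a C
    if matches_b(text):
        return None          # an A that is also a B: excluded, C fails
    a = f"<A>{text}</A>" if text else "<A/>"
    return f"<C>{a}</C>"
```

Under these assumptions, parse_c("caterpillar") and parse_c("") each yield a C containing an A, while parse_c("cat") yields None: the parse fails rather than producing a B.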
--
*John Lumley* MA PhD CEng FIEE
john@saxonica.com