Error states in the spec from Norm Tovey-Walsh on 2021-11-10 (public-ixml@w3.org from November 2021)

From: Norm Tovey-Walsh <norm@saxonica.com>
Date: Wed, 10 Nov 2021 18:40:03 +0000
To: public-ixml@w3.org
Message-ID: <m2wnlfsv0l.fsf@saxonica.com>

Hello,

Pursuant to my action, I’ve re-read the ixml specification with an eye
towards identifying error states. It’s a little harder to identify them
than I expected. The spec (quite reasonably) states conditions that must
be true, but doesn’t always describe how they might not be true.

For example, an opening quote mark that doesn’t have a matching closing
quote mark is obviously an “unterminated string” error. But the spec
doesn’t actually say that, I don’t think.
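
As an illustration only, here is a minimal Python sketch of how an
implementation might detect the unterminated-string case. The names
read_string and UnterminatedStringError are made up for this example, and
the sketch assumes that a doubled quote stands for a literal quote and
that a string may not contain a line break.

```python
from typing import Tuple


class UnterminatedStringError(ValueError):
    """Hypothetical error type; the spec does not name this condition."""


def read_string(text: str, start: int) -> Tuple[str, int]:
    """Read a quoted string whose opening quote is at text[start].

    Returns the string value and the index just past the closing quote.
    """
    quote = text[start]
    if quote not in "\"'":
        raise ValueError("not positioned at an opening quote")
    chars = []
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch in "\r\n":
            break  # assumption: a string may not span a line break
        if ch == quote:
            # A doubled quote is an escaped quote, not the terminator.
            if i + 1 < len(text) and text[i + 1] == quote:
                chars.append(quote)
                i += 2
                continue
            return "".join(chars), i + 1
        chars.append(ch)
        i += 1
    # Ran off the end of the line or the input without a closing quote.
    raise UnterminatedStringError(
        f"unterminated string starting at offset {start}"
    )
```

Calling read_string('"abc', 0) raises the hypothetical error, while a
well-formed string returns its value and the offset after the closing
quote.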

I expect that a little implementation experience will bring more kinds
of errors to light. In the meantime, there are some in the
specification:

1. An encoded character that is outside of the Unicode code-point range
2. An encoded character that is a noncharacter or surrogate code point
3. Names that must be serialized but are not XML names
4. Non-conforming grammars must be rejected
5. There must be exactly one rule for every non-terminal
6. Terminal symbols must not be marked as attributes
7. In a character class, the class must be defined in Unicode
8. Unterminated strings
9. Other syntax errors in the grammar
10. It is possible for an implementation to run out of memory or
    otherwise be unable to complete a parse
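
To make items 1, 2 and 5 concrete, here is a rough Python sketch of the
corresponding checks. The function names and error messages are
hypothetical, and the sketch assumes an encoded character has already
been reduced to its hexadecimal digits and that the grammar has already
been reduced to a list of rule names plus a set of referenced
nonterminals.

```python
from collections import Counter
from typing import List, Set

MAX_CODE_POINT = 0x10FFFF  # upper bound of the Unicode code-point range


def check_encoded_char(hex_digits: str) -> str:
    """Validate a hex-encoded character and return it (items 1 and 2)."""
    cp = int(hex_digits, 16)
    if cp > MAX_CODE_POINT:
        raise ValueError(f"#{hex_digits} is outside the Unicode range")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"#{hex_digits} is a surrogate code point")
    # Noncharacters are U+FDD0..U+FDEF plus the last two code points
    # of every plane (U+FFFE, U+FFFF, U+1FFFE, ...).
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        raise ValueError(f"#{hex_digits} is a noncharacter")
    return chr(cp)


def check_rules(defined: List[str], referenced: Set[str]) -> List[str]:
    """Report nonterminals without exactly one rule (item 5).

    `defined` holds one entry per rule in the grammar, so a name that
    appears twice has two rules.
    """
    problems = []
    counts = Counter(defined)
    for name, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"multiple rules for {name}")
    for name in sorted(referenced - set(counts)):
        problems.append(f"no rule for {name}")
    return problems
```

Collecting the rule problems in a list rather than raising on the first
one is a design choice for the sketch; an implementation could equally
stop at the first error.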

                                        Be seeing you,
                                          norm

--
Norm Tovey-Walsh
Saxonica

Received on Wednesday, 10 November 2021 18:53:00 UTC