BOM clarification

The minutes (and Norm's action) show me that I was unclear today.

    Michael: I think that should be a "should" not a "must".

What I meant to say was that I think a MUST makes sense for a BOM
appearing at the beginning of an input grammar, but a SHOULD makes sense
for a BOM appearing at the beginning of an input string, if we wish to
preserve the idea that one could in principle parse binary data with
ixml.

I note in passing that, while we believe that in practice unexpected
BOMs occur only in UTF-8 data streams, our rule can be more general:
if a BOM appears as the first character in any data stream, it is
either definitely (in the case of an input grammar) or almost
certainly (in the case of an input string) not intended as data and is
better ignored.  That holds for any encoding, including UTF-16, not
just UTF-8.  (It's Norm's action to draft this, not mine, so this is
just a suggestion.)
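
As a rough illustration of the distinction, and not as proposed spec
text, the rule could be sketched in Python as follows.  The names
strip_leading_bom and keep_input_bom are hypothetical, invented for
this example.  The sketch assumes the stream has already been decoded
to characters, so that a BOM shows up as U+FEFF whatever the encoding
was.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark as a decoded character


def strip_leading_bom(text: str, *, is_grammar: bool,
                      keep_input_bom: bool = False) -> str:
    """Drop a BOM that appears as the first character of a stream.

    Hypothetical helper, not part of any ixml implementation.
    - Input grammar: the BOM is always dropped (the MUST case).
    - Input string: the BOM is dropped by default (the SHOULD case),
      but a caller parsing binary-like data may opt out.
    """
    if not text.startswith(BOM):
        return text  # no leading BOM, nothing to do
    if is_grammar:
        return text[1:]  # MUST: never treated as grammar data
    if keep_input_bom:
        return text  # caller explicitly wants U+FEFF kept as data
    return text[1:]  # SHOULD: ignored unless the caller opts out


# The same character results from a UTF-8 or a UTF-16 (LE) byte stream
# when the codec does not itself consume the BOM.
from_utf8 = b"\xef\xbb\xbfabc".decode("utf-8")
from_utf16 = b"\xff\xfea\x00b\x00c\x00".decode("utf-16-le")

grammar_text = strip_leading_bom(from_utf8, is_grammar=True)
input_text = strip_leading_bom(from_utf16, is_grammar=False)
raw_input_text = strip_leading_bom(from_utf16, is_grammar=False,
                                   keep_input_bom=True)
```

The opt-out flag is only there to show why SHOULD rather than MUST
might suit input strings; whether a processor offers such an option,
and how, is left to whoever drafts the text.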

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

Received on Tuesday, 9 May 2023 15:02:19 UTC