> If an implementation's I/O routines don't handle BOMs, then surely an
> implementor can work around that with an ad hoc routine when opening a
> stream?
>
> Presumably I'm missing something. What is it?
My impression is that I/O subsystems consume the BOM on UTF-16+ systems
because they need it to work out the byte order. They don’t need it on
UTF-8, and it’s discouraged on UTF-8, so they ignore it and pass it
through.
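
As a rough illustration of that difference, here is a small sketch using Python's standard codecs. It is only an analogy for how an I/O layer might behave, not a description of any particular ixml implementation, and the variable names are made up for the example.

```python
import codecs

# One character, "S", preceded by a BOM in each encoding.
utf16_bytes = codecs.BOM_UTF16_LE + "S".encode("utf-16-le")
utf8_bytes = codecs.BOM_UTF8 + "S".encode("utf-8")

# The generic "utf-16" codec reads the BOM to pick the byte order
# and does not return it as part of the text.
utf16_text = utf16_bytes.decode("utf-16")

# The plain "utf-8" codec has no use for the BOM, so U+FEFF is
# passed through as the first character of the decoded string.
utf8_text = utf8_bytes.decode("utf-8")

print(repr(utf16_text))
print(repr(utf8_text))
```
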
I think ignoring the BOM on input grammars is perfectly reasonable and we
should say that.
Ignoring the BOM on input documents is a little harder: what if I am
parsing a non-text document that happens to begin with #FEFF? But I’d
leave that to the discretion of the implementor because I think it’s
very unlikely.
My plan is to add an option to CoffeePot that ignores the BOM when the
input file is UTF-8, with the default set to “true”.
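
A minimal sketch of what such an option could look like, written in Python purely for illustration. The function names and the `ignore_bom` parameter are hypothetical and are not CoffeePot's actual API; the sketch also assumes the input has already been identified as UTF-8.

```python
import codecs

def strip_utf8_bom(data: bytes, ignore_bom: bool = True) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF) when ignore_bom is set."""
    if ignore_bom and data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

def read_input(data: bytes, ignore_bom: bool = True) -> str:
    """Decode UTF-8 input, optionally ignoring a leading BOM."""
    # The plain "utf-8" codec keeps U+FEFF if the bytes are still there,
    # so with ignore_bom=False the parser would see #FEFF as input.
    return strip_utf8_bom(data, ignore_bom).decode("utf-8")
```

With the default, a document that starts with a BOM is handed to the parser without it; passing `ignore_bom=False` keeps #FEFF as the first character, which covers the unlikely case of input that really does begin with that character.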
Be seeing you,
norm
--
Norm Tovey-Walsh
Saxonica