Error in the Errata from Alexandru Berlea on 2000-08-24 (xml-editor@w3.org from July to September 2000)

From: Alexandru Berlea <aberlea@psi.uni-trier.de>
Date: Thu, 24 Aug 2000 17:20:00 +0200
To: xml-editor@w3.org
Message-ID: <39A53D20.EC73EF24@psi.uni-trier.de>

Hello,

please consider the following, which I believe to be an error in erratum
E44 of the XML 1.0 Specification Errata.

If the following holds: "The notation ## is used to denote any byte
value except 00.",
then the following quoted lines from the Errata:

>      00 3C ## ##,
>      00 25 ## ##,
>      00 20 ## ##,
>      00 09 ## ##,
>      00 0D ## ## or
>      00 0A ## ##: Big-endian UTF-16 or ISO-10646-UCS-2. Note that, absent
>                   an encoding declaration, these cases are strictly
>                   speaking in error.
>
should be

"    00 3C 00 ##,
     00 25 00 ##,
     00 20 00 ##,
     00 09 00 ##,
     00 0D 00 ## or
     00 0A 00 ##: Big-endian UTF-16 or ISO-10646-UCS-2. Note that, absent
                  an encoding declaration, these cases are strictly
                  speaking in error."

Otherwise a file beginning with the bytes 00 3C 00 3F, which according
to the original XML 1.0 Specification would correctly be interpreted as
big-endian UTF-16, would now, according to the Errata, fall into the
category "other", i.e. would be wrongly interpreted as UTF-8.
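
The mismatch can be checked mechanically. The following is only a minimal sketch in Python, not anything taken from the Specification or the Errata: the helper name `matches` and the way patterns are written as strings are assumptions made for illustration. It reads "##" strictly as "any byte except 00" and tests the bytes 00 3C 00 3F against the E44 pattern and against the proposed one.

```python
def matches(pattern, data):
    """Return True if the leading bytes of data fit the pattern.

    pattern is a string such as "00 3C ## ##" (hypothetical notation for
    this sketch); "##" stands for any byte value except 00.
    """
    tokens = pattern.split()
    if len(data) < len(tokens):
        return False
    for token, byte in zip(tokens, data):
        if token == "##":
            # "##" must NOT be a zero byte
            if byte == 0x00:
                return False
        elif int(token, 16) != byte:
            # a literal hex token must match exactly
            return False
    return True


# "<?" encoded as big-endian UTF-16
sample = bytes([0x00, 0x3C, 0x00, 0x3F])

errata_match = matches("00 3C ## ##", sample)    # pattern as printed in E44
proposed_match = matches("00 3C 00 ##", sample)  # pattern proposed above

print(errata_match, proposed_match)  # False True
```

Under this reading the third byte (00) is rejected by "##", so the E44 pattern does not cover the sample, while the proposed pattern does.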

Analogously the lines

>      3C 00 ## ##,
>      25 00 ## ##,
>      20 00 ## ##,
>      09 00 ## ##,
>      0D 00 ## ## or
>      0A 00 ## ##: Little-endian UTF-16 or ISO-10646-UCS-2. Note that, absent
>                   an encoding declaration, these cases are strictly
>                   speaking in error.
>

should, in my opinion, be

"    3C 00 ## 00,
     25 00 ## 00,
     20 00 ## 00,
     09 00 ## 00,
     0D 00 ## 00 or
     0A 00 ## 00: Little-endian UTF-16 or ISO-10646-UCS-2. Note that, absent
                  an encoding declaration, these cases are strictly
                  speaking in error."
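
The little-endian case can be sketched the same way. Again this is only an illustration under the stated reading of "##" (any byte except 00); the helper name `fits` and the string form of the patterns are assumptions, not part of the Errata. The sample is "<?" in little-endian UTF-16, i.e. the bytes 3C 00 3F 00.

```python
def fits(pattern, data):
    """True if data starts with bytes fitting pattern; "##" means any byte except 00."""
    tokens = pattern.split()
    return len(data) >= len(tokens) and all(
        (byte != 0x00) if token == "##" else (byte == int(token, 16))
        for token, byte in zip(tokens, data)
    )


# "<?" encoded as little-endian UTF-16
le_sample = bytes([0x3C, 0x00, 0x3F, 0x00])

le_errata = fits("3C 00 ## ##", le_sample)    # pattern as printed in E44
le_proposed = fits("3C 00 ## 00", le_sample)  # pattern proposed above

print(le_errata, le_proposed)  # False True
```

Here it is the fourth byte (00) that the E44 pattern rejects, which is why the proposed pattern fixes the last position to 00 instead.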

Is this right?

Thank you.

Regards,

Alexandru Berlea.

Received on Thursday, 24 August 2000 11:20:45 UTC