- From: Alexandru Berlea <aberlea@psi.uni-trier.de>
- Date: Thu, 24 Aug 2000 17:20:00 +0200
- To: xml-editor@w3.org
- Message-ID: <39A53D20.EC73EF24@psi.uni-trier.de>
Hello,
Please consider the following, which I believe to be an error in
erratum E44 of the XML 1.0 Specification Errata.
If the following holds: "The notation ## is used to denote any byte
value except 00.",
then the following quoted lines from the Errata:
> 00 3C ## ##,
> 00 25 ## ##,
> 00 20 ## ##,
> 00 09 ## ##,
> 00 0D ## ## or
> 00 0A ## ##: Big-endian UTF-16 or ISO-10646-UCS-2. Note that, absent
> an encoding declaration, these cases are strictly
> speaking in error.
>
should be
" 00 3C 00 ##,
00 25 00 ##,
00 20 00 ##,
00 09 00 ##,
00 0D 00 ## or
00 0A 00 ##: Big-endian UTF-16 or ISO-10646-UCS-2. Note that, absent
an encoding declaration, these cases are strictly
speaking in error."
Otherwise a file beginning with the bytes 00 3C 00 3F, which according
to the original XML 1.0 Specification would correctly be interpreted as
big-endian UTF-16, would now, according to the Errata, fall into the
category "other", i.e. would wrongly be interpreted as UTF-8.
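
The mismatch described above can be checked mechanically. The following is a minimal sketch in Python, not part of the Errata; the helper name `matches` and the pattern syntax it parses are assumptions made for illustration only. It treats ## as "any byte except 00", exactly as the quoted definition says.

```python
def matches(pattern: str, data: bytes) -> bool:
    """Return True if the leading bytes of data fit a pattern such as '00 3C ## ##'.

    '##' stands for any byte value except 00, as in the quoted definition.
    Any other token is read as a two-digit hexadecimal byte value.
    """
    tokens = pattern.split()
    if len(data) < len(tokens):
        return False
    for token, byte in zip(tokens, data):
        if token == "##":
            if byte == 0:
                return False  # '##' explicitly excludes 00
        elif byte != int(token, 16):
            return False
    return True


# "<?" encoded as big-endian UTF-16
prefix = bytes([0x00, 0x3C, 0x00, 0x3F])

print(matches("00 3C ## ##", prefix))  # wording in the Errata: False
print(matches("00 3C 00 ##", prefix))  # proposed wording: True
```

With the Errata wording the third byte (00) is rejected by ##, so the pattern does not match; with the proposed wording it does.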
Analogously, the lines
> 3C 00 ## ##,
> 25 00 ## ##,
> 20 00 ## ##,
> 09 00 ## ##,
> 0D 00 ## ## or
> 0A 00 ## ##: Little-endian UTF-16 or ISO-10646-UCS-2. Note that, absent
> an encoding declaration, these cases are strictly
> speaking in error.
>
should, in my opinion, be
" 3C 00 ## 00,
25 00 ## 00,
20 00 ## 00,
09 00 ## 00,
0D 00 ## 00 or
0A 00 ## 00: Little-endian UTF-16 or ISO-10646-UCS-2. Note that, absent
an encoding declaration, these cases are strictly
speaking in error."
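
As a rough illustration of the two proposed pattern sets taken together, here is a small Python sketch. The function name `guess_family` and its return strings are hypothetical, and it covers only the twelve patterns discussed in this message, not byte order marks, UCS-4, or the other cases of the autodetection table.

```python
# First characters covered by the patterns above: '<', '%', space, tab, CR, LF
LEADS = (0x3C, 0x25, 0x20, 0x09, 0x0D, 0x0A)


def guess_family(data: bytes) -> str:
    """Classify the first four bytes using the proposed patterns only."""
    if len(data) < 4:
        return "other"
    b0, b1, b2, b3 = data[:4]
    # Proposed big-endian form: 00 XX 00 ##
    if b0 == 0 and b1 in LEADS and b2 == 0 and b3 != 0:
        return "big-endian"
    # Proposed little-endian form: XX 00 ## 00
    if b0 in LEADS and b1 == 0 and b2 != 0 and b3 == 0:
        return "little-endian"
    return "other"


print(guess_family(bytes([0x00, 0x3C, 0x00, 0x3F])))  # big-endian
print(guess_family(bytes([0x3C, 0x00, 0x3F, 0x00])))  # little-endian
print(guess_family(b"<?xm"))                          # other
```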
Is this right?
Thank you.
Regards,
Alexandru Berlea.
Received on Thursday, 24 August 2000 11:20:45 UTC