- From: Alexandru Berlea <aberlea@psi.uni-trier.de>
- Date: Thu, 24 Aug 2000 17:20:00 +0200
- To: xml-editor@w3.org
- Message-ID: <39A53D20.EC73EF24@psi.uni-trier.de>
Hello,

please consider the following, which I believe to be an error in the XML 1.0 Specification Errata, E44.

If the following holds: "The notation ## is used to denote any byte value except 00.", then the following quoted lines from the Errata:

> 00 3C ## ##,
> 00 25 ## ##,
> 00 20 ## ##,
> 00 09 ## ##,
> 00 0D ## ## or
> 00 0A ## ##: Big-endian UTF-16 or ISO-10646-UCS-2. Note that, absent
> an encoding declaration, these cases are strictly
> speaking in error.

should be

"00 3C 00 ##, 00 25 00 ##, 00 20 00 ##, 00 09 00 ##, 00 0D 00 ## or 00 0A 00 ##: Big-endian UTF-16 or ISO-10646-UCS-2. Note that, absent an encoding declaration, these cases are strictly speaking in error."

Otherwise a file beginning with the bytes 00 3C 00 3F, which according to the original XML 1.0 Specification would correctly be interpreted as big-endian UTF-16, would now, according to the Errata, fall into the category "other", i.e. it would be wrongly interpreted as UTF-8.

Analogously, the lines

> 3C 00 ## ##,
> 25 00 ## ##,
> 20 00 ## ##,
> 09 00 ## ##,
> 0D 00 ## ## or
> 0A 00 ## ##: Little-endian UTF-16 or ISO-10646-UCS-2. Note that, absent
> an encoding declaration, these cases are strictly
> speaking in error.

should in my opinion be

"3C 00 ## 00, 25 00 ## 00, 20 00 ## 00, 09 00 ## 00, 0D 00 ## 00 or 0A 00 ## 00: Little-endian UTF-16 or ISO-10646-UCS-2. Note that, absent an encoding declaration, these cases are strictly speaking in error."

Is this right?

Thank you.

Regards,
Alexandru Berlea.
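
The mismatch described above can be checked mechanically. The following is only a minimal Python sketch, not part of the Errata or of any XML parser: the names `ANY_NONZERO`, `matches`, `any_match` and the pattern lists are hypothetical, and `##` is modelled exactly as stated in the message, i.e. as any byte value except 00.

```python
# Hypothetical marker standing for '##' (any byte value except 00).
ANY_NONZERO = None

# First characters listed in the quoted patterns: '<', '%', space, tab, CR, LF.
FIRST_CHARS = (0x3C, 0x25, 0x20, 0x09, 0x0D, 0x0A)

def matches(prefix: bytes, pattern) -> bool:
    """Return True if the leading bytes of `prefix` fit `pattern`."""
    if len(prefix) < len(pattern):
        return False
    for byte, want in zip(prefix, pattern):
        if want is ANY_NONZERO:
            if byte == 0x00:          # '##' excludes 00
                return False
        elif byte != want:
            return False
    return True

def any_match(prefix: bytes, patterns) -> bool:
    """Return True if `prefix` fits at least one of `patterns`."""
    return any(matches(prefix, p) for p in patterns)

# Patterns as quoted from erratum E44 in this message.
ERRATA_BE = [(0x00, c, ANY_NONZERO, ANY_NONZERO) for c in FIRST_CHARS]
ERRATA_LE = [(c, 0x00, ANY_NONZERO, ANY_NONZERO) for c in FIRST_CHARS]

# Patterns as proposed in this message.
PROPOSED_BE = [(0x00, c, 0x00, ANY_NONZERO) for c in FIRST_CHARS]
PROPOSED_LE = [(c, 0x00, ANY_NONZERO, 0x00) for c in FIRST_CHARS]

# "<?" encoded as UTF-16, big-endian and little-endian, without a byte order mark.
sample_be = bytes([0x00, 0x3C, 0x00, 0x3F])
sample_le = bytes([0x3C, 0x00, 0x3F, 0x00])

print(any_match(sample_be, ERRATA_BE))    # quoted big-endian patterns
print(any_match(sample_be, PROPOSED_BE))  # proposed big-endian patterns
print(any_match(sample_le, ERRATA_LE))    # quoted little-endian patterns
print(any_match(sample_le, PROPOSED_LE))  # proposed little-endian patterns
```

Under these assumptions the quoted patterns reject both samples, because each contains a 00 byte in a position marked ##, while the proposed patterns accept them.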
Received on Thursday, 24 August 2000 11:20:45 UTC