- From: <bugzilla@wiggum.w3.org>
- Date: Mon, 29 Jun 2009 07:19:59 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=6858

--- Comment #4 from Martin Dürst <duerst@it.aoyama.ac.jp> 2009-06-29 07:19:59 ---

Looking at http://dev.w3.org/html5/spec/Overview.html#ascii-compatible-character-encoding:

This solves the problem, but is needlessly complex. Instead of

An ASCII-compatible character encoding is a single-byte or variable-length encoding in which the bytes 0x09, 0x0A, 0x0C, 0x0D, 0x20 - 0x22, 0x26, 0x27, 0x2C - 0x3F, 0x41 - 0x5A, and 0x61 - 0x7A, ignoring bytes that are the second and later bytes of multibyte sequences, all correspond to single-byte sequences that map to the same Unicode characters as those bytes in ANSI_X3.4-1968 (US-ASCII).

the following would say the same but would be simpler:

An ASCII-compatible character encoding is a character encoding in which the Unicode characters that have byte values 0x09, 0x0A, 0x0C, 0x0D, 0x20 - 0x22, 0x26, 0x27, 0x2C - 0x3F, 0x41 - 0x5A, and 0x61 - 0x7A in ANSI_X3.4-1968 (US-ASCII, [RFC1345]) are represented by exactly and only the same byte values.

The note after that is also a good start, but needs some more work. Shift_JIS is used on every Japanese PC and Mac, so I wouldn't call this an exotic encoding. On the other hand, I didn't find a *submitted* draft for UTF-8+names, so whatever you think about it, it's clearly a dead end at this point in time. So I would reword:

Note: This includes such exotic encodings as Shift_JIS and variants of ISO-2022, even though it is possible for bytes like 0x70 to be part of longer sequences that are unrelated to their interpretation as ASCII. It excludes such encodings as UTF-7, UTF-8+names, UTF-16, HZ-GB-2312, GSM03.38, and EBCDIC variants.

to something like:

Note: This includes encodings such as Shift_JIS and variants of ISO-2022, where it is possible for bytes like 0x70 to appear as part of multibyte sequences that are unrelated to their interpretation as ASCII.
It excludes encodings such as UTF-7, UTF-16, HZ-GB-2312, GSM03.38, and EBCDIC variants.
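
As a rough illustration of the simpler wording, the check below encodes and decodes each listed character on its own using Python's built-in codecs. The helper name `looks_ascii_compatible` and the constant `ASCII_SUBSET` are hypothetical, and the byte list is assumed to be copied correctly from the definition above. This is only a sketch of the single-character case, not an implementation of either definition.

```python
# Byte values listed in the proposed definition (copied from the comment above).
ASCII_SUBSET = (
    [0x09, 0x0A, 0x0C, 0x0D, 0x26, 0x27]
    + list(range(0x20, 0x23))   # 0x20 - 0x22
    + list(range(0x2C, 0x40))   # 0x2C - 0x3F
    + list(range(0x41, 0x5B))   # 0x41 - 0x5A
    + list(range(0x61, 0x7B))   # 0x61 - 0x7A
)


def looks_ascii_compatible(encoding: str) -> bool:
    """Hypothetical helper: test each listed character in isolation.

    Returns True when every listed character encodes to exactly its
    US-ASCII byte and that single byte decodes back to the same character.
    """
    for value in ASCII_SUBSET:
        byte = bytes([value])
        char = byte.decode("ascii")
        try:
            if char.encode(encoding) != byte:
                return False
            if byte.decode(encoding) != char:
                return False
        except UnicodeError:
            # The codec cannot represent or interpret this value at all.
            return False
    return True


if __name__ == "__main__":
    for name in ("utf-8", "shift_jis", "utf-16", "cp037"):
        print(name, looks_ascii_compatible(name))
```

Because each character is tested alone and in the codec's initial state, this sketch says nothing about trail bytes in multibyte sequences or about stateful encodings such as HZ-GB-2312 and the ISO-2022 variants; its result for those should not be read as a verdict under either wording.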
Received on Monday, 29 June 2009 07:20:09 UTC