- From: Richard Ishida <ishida@w3.org>
- Date: Wed, 25 Aug 2010 18:57:03 +0100
- To: "'Gunnar Bittersmann'" <gunnar@bittersmann.de>
- Cc: <www-international@w3.org>
> From: www-international-request@w3.org [mailto:www-international-request@w3.org] On Behalf Of Gunnar Bittersmann
> Sent: 17 August 2010 11:19
> To: www-international@w3.org
> Subject: Re: For review: 6 new and 2 updated articles about character encoding
>
> Sorry for the cliffhangers. ;-) Some more proposals:
>
> http://www.w3.org/International/questions/qa-escapes.en.php#bytheway
>
> Typography: “ie. á could be represented as á”
>
> Use <span class="qchar">á</span> (displayed in bigger font, wrapped in
> ') as before in the paragraph and in the beginning of the document.
>
> The same might apply to “single ampersand (&)” in the last paragraph.

Done.

> ***
>
> http://www.w3.org/International/tutorials/tutorial-char-enc/#n11n
>
> “text in a script that uses accents or diacritics.”
>
> Accents are a kind of diacritic. Make it: text in a script that uses
> accents or other diacritics.

Done.

> ***
>
> http://www.w3.org/International/articles/definitions-characters/#unicode
>
> It could be mentioned that 65,536 = 2^16.

Done.

> http://www.w3.org/International/articles/definitions-characters/#charsets
>
> “(Note that hexadecimal notation is commonly used for referring to code
> points, and will be used here.)”
>
> That’s fine.
>
> “For example, the letter A in the ISO 8859-1 coded character set is in
> the 65th character position (starting from zero), and is encoded for
> representation in the computer using a byte with the value of 65.”
>
> Oops, decimal.

I think that's ok. I'm trying to make the link here, and the byte value is indeed 65. I'm not referring to a codepoint by name.

> http://www.w3.org/International/articles/definitions-characters/#httpheader
>
> When you retrieve a document from a server, the server normally sends
> some additional information with the document. This is called the HTTP
> header.
>
> Fine.
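For reference, the two numeric points above (65,536 = 2^16, and the letter A being encoded in ISO 8859-1 as a byte with decimal value 65, which is 41 in hexadecimal notation) can be checked with a small sketch. Python is used purely for illustration, and the variable names are made up for the example; none of this comes from the articles under review.

```python
# Illustrative check only; not taken from the articles under review.

# 65,536 is 2 to the power of 16.
two_to_16 = 2 ** 16
print(two_to_16)  # 65536

# The letter A encoded in ISO 8859-1 is a single byte.
encoded = "A".encode("iso-8859-1")
byte_value = encoded[0]

# Decimal 65 and hexadecimal 41 are the same byte value.
print(len(encoded), byte_value, format(byte_value, "X"))  # 1 65 41
```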
>
> http://www.w3.org/International/articles/definitions-characters/#mimetypes

This section has been significantly reworked, and I think the comments are now moot.

> “When a server serves (ie. sends) a document to a browser (or user agent)…”
>
> Browsers are a kind of user agent. Make it: browser (or other user agent)
>
> “…it also sends some additional information with the document, called
> the HTTP header.”
>
> Is the duplication of content (see above) necessary in this place?
>
>
> “HTML is an SGML-based markup language.”
>
> It could (should?) be mentioned here that HTML5 (in HTML serialization)
> is not SGML-based.
>
>
> “that you leave a space before the '' at the end of an empty tag”
>
> '/' missing: that you leave a space before the '/' at the end of an
> empty tag
>
> However, this recommendation is outdated; no current browser has
> problems with <foo/>.
>
> “that you always use both id and name attributes for fragment identifiers”
>
> Outdated.
>
> ***
>
> http://www.w3.org/International/questions/qa-chars-vs-markup#ok
>
> “This is not an exhaustive list.” Fine. Is “etc.” worth a table row, then?

Removed.

> http://www.w3.org/International/questions/qa-chars-vs-markup#compat
>
> In the next table, it is “Etc…”
>
> Make it the same in both tables, or remove it.

Removed.

> “Superscripted and subscripted characters | ¹ ² ³ ₁ ₂ ₃ | use <sup> or
> <sub> markup”
>
> I tend to disagree here. The superscripted and subscripted characters
> carry information (x² is something different than x₂) that might get
> lost when <sup> or <sub> markup is used and text is copied without
> markup from a webpage (x<sup>2</sup> and x<sub>2</sub> both become x2;
> 4<sup>2</sup> becomes 42).
>
> And there is a typography/readability issue: The superscripted and
> subscripted characters should be readable at reasonable font sizes,
> whereas scaled-down characters (e.g. sup, sub { font-size: 0.25em })
> might not be readable and might not fit typographically.

This is an issue that needs to be raised against the Unicode in XML document.

> ***
>
> http://www.w3.org/International/questions/qa-byte-order-mark#bomwhat
>
> As pointed out, UTF-32 is out of the game and not mentioned in “When a
> character is encoded in UTF-16, its 2 or 4 bytes can be ordered in two
> different ways ('little-endian' or 'big-endian').”
>
> Since it’s all about UTF-16, it is confusing why UTF-16 is mentioned in
> the next sentence “The picture below illustrates this for UTF-16.”
>
> Make it: The picture below illustrates this.

Done.

Thanks.
RI
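
For reference, the byte-order point in the quoted sentence ("its 2 or 4 bytes can be ordered in two different ways") can be illustrated with a small sketch. Python is used only for illustration, and the two sample characters are arbitrary choices for the example, not taken from the article.

```python
import codecs

# Arbitrary sample characters (assumptions for illustration only).
two_byte_char = "\u00e1"       # á, encoded as 2 bytes in UTF-16
four_byte_char = "\U0001D11E"  # a supplementary character, 4 bytes in UTF-16

# The same character serialized in both byte orders.
be2 = two_byte_char.encode("utf-16-be")
le2 = two_byte_char.encode("utf-16-le")
print(be2.hex(), le2.hex())  # 00e1 e100

# A 4-byte character is a surrogate pair; each 16-bit unit is swapped.
be4 = four_byte_char.encode("utf-16-be")
le4 = four_byte_char.encode("utf-16-le")
print(be4.hex(), le4.hex())  # d834dd1e 34d81edd

# Python's plain "utf-16" codec prepends a byte order mark so a reader
# can tell which order was used; which one appears depends on the platform.
with_bom = two_byte_char.encode("utf-16")
has_bom = with_bom[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)
```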
Received on Wednesday, 25 August 2010 17:57:37 UTC