- From: <w3t-archive+esw-wiki@w3.org>
- Date: Wed, 20 Jul 2005 11:47:11 -0000
- To: w3t-archive+esw-wiki@w3.org
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "ESW Wiki" for change notification.

The following page has been changed by Deborah Cawkwell:
http://esw.w3.org/topic/geoUnicodeConsiderationsWhenUpgrading

The comment on the change is:
Page weight & RFC

------------------------------------------------------------------------------
  Slightly heavier, but since a large proportion of a web page is HTML mark-up, whose characters remain 1 byte each, the increase is negligible.

+ '''QUERY FOR (DRC): did you mean the following RFC? [http://www.ietf.org/rfc/rfc3629.txt?number=3629 RFC: UTF-8, a transformation format of ISO 10646] I couldn't find the useful bit you mentioned re weight. Could you point me to it? ALSO QUERY FOR ALL - should we point to a page weight tool here?'''
+
   * Latin languages: characters outside the ASCII range (128 code points), e.g. e acute (é), are represented by one byte in ISO-8859-1 but by two bytes in UTF-8, so a small but acceptable increase in page size should be expected.
   * Characters outside the ASCII range in other scripts take more: Arabic and Russian (Cyrillic) letters use 2 bytes in UTF-8, and Chinese characters typically use 3. Legacy Chinese encodings already use two bytes per character.

@@ -120, +122 @@
   * [http://www.unicode.org Unicode Consortium]
   * [http://www.w3.org/International/tutorials/tutorial-char-enc Tutorial: Character sets & encodings in XHTML, HTML and CSS]
   * [http://www.w3.org/International/questions/qa-doc-charset Document Character Set for HTML and XML]
+  * [http://www.ietf.org/rfc/rfc3629.txt?number=3629 RFC: UTF-8, a transformation format of ISO 10646]
   * [http://www.alanwood.net/unicode/browsers.html Unicode & multilingual web browsers]
   * [http://en.wikipedia.org/wiki/Unicode_and_HTML Unicode & HTML]
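
The byte counts quoted in the page-weight bullets of the change above can be checked with a short script. This is only a minimal sketch and not part of the wiki change: the function names and sample strings are made up for illustration, and GB2312 is assumed here as a stand-in for "legacy Chinese encodings".

```python
# Illustrative sketch: compare encoded sizes in a legacy encoding and in UTF-8.
# All names and sample strings below are hypothetical, not from the wiki page.

def encoded_size(text: str, encoding: str) -> int:
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))


def extra_bytes(page: str, legacy_encoding: str) -> int:
    """Return how many bytes a page gains when moved from a legacy encoding to UTF-8."""
    return encoded_size(page, "utf-8") - encoded_size(page, legacy_encoding)


# One sample character per script, paired with a legacy encoding that covers it.
SAMPLES = [
    ("ASCII letter", "e", "iso-8859-1"),
    ("e acute", "\u00e9", "iso-8859-1"),
    ("Cyrillic letter", "\u043f", "iso-8859-5"),
    ("Arabic letter", "\u0633", "iso-8859-6"),
    ("Chinese character", "\u4e2d", "gb2312"),  # assumed legacy encoding
]

for label, char, legacy in SAMPLES:
    print(f"{label}: {encoded_size(char, legacy)} byte(s) in {legacy}, "
          f"{encoded_size(char, 'utf-8')} byte(s) in utf-8")

# Mark-up is ASCII, so only the accented letter adds weight to this fragment.
fragment = '<p class="intro">Caf\u00e9</p>'
print("extra bytes for fragment:", extra_bytes(fragment, "iso-8859-1"))
```

Running the same comparison over a real page's text would give a rough idea of the page-weight increase to expect, which may be useful until a suitable page weight tool is agreed on.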
Received on Wednesday, 20 July 2005 14:32:42 UTC