- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Fri, 21 May 2004 05:10:23 +0200
- To: www-i18n-comments@w3.org
Hi,

[1] http://www.w3.org/TR/i18n-html-tech-char/
[2] http://www.w3.org/International/tutorials/tutorial-char-enc.html

Could either or both of these please include some basic discussion and illustration of what a character encoding actually is? This is difficult to teach, because many people have never come into contact with binary data; they use their text editor for "text" documents, and most of the time it works just fine. Such documents should break that assumption at the very beginning: this is binary data, as in 100101010010101011010101010001... An image or two would help; here is a poor example:

http://lists.w3.org/Archives/Public/www-archive/2004May/att-0050/encoding.png

Basically all [2] says about this, relatively late in the document, is

... The character encoding reflects the way these abstract characters are mapped to bytes for manipulation in a computer. ...

And [1] contains more or less nothing that would help readers understand what is going on behind the scenes of the software they use every day.

Catch the reader with logic. In my example, if the charset=utf-8 parameter is missing, how is a browser supposed to know how to turn 100101001... into characters? It cannot. That is what readers need to understand.

regards.
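
P.S. To make the point concrete, here is a minimal Python sketch of the kind of illustration I have in mind. It is not taken from [1] or [2]; the sample word and the variable names are my own assumptions. It shows one sequence of bits turning into different characters depending on which encoding the reader assumes:

```python
# Hypothetical sample text (not from either document): "caf" + e-acute.
text = "caf\u00e9"

# What is actually stored or sent over the wire: bytes, not characters.
raw = text.encode("utf-8")

# The same data written out as bits, the way a machine sees it.
bits = " ".join(f"{byte:08b}" for byte in raw)

# Decoding requires choosing an encoding. Same bits, different answers.
as_utf8 = raw.decode("utf-8")
as_latin1 = raw.decode("iso-8859-1")

print(bits)       # 01100011 01100001 01100110 11000011 10101001
print(as_utf8)    # the original four characters
print(as_latin1)  # five characters: the e-acute became two wrong ones
```

Nothing in the bits themselves says which of the two readings is the intended one; that information has to come from outside the data, for example from a charset parameter.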
Received on Thursday, 20 May 2004 23:10:43 UTC