- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Wed, 3 Jul 2002 21:50 +0900
- To: www-i18n-comments@w3.org
- Cc: steven.pemberton@cwi.nl (Steven Pemberton)
This is a last call comment from Steven Pemberton (steven.pemberton@cwi.nl) on
the Character Model for the World Wide Web 1.0
(http://www.w3.org/TR/2002/WD-charmod-20020430/).
Semi-structured version of the comment:
Submitted by: Steven Pemberton (steven.pemberton@cwi.nl)
Submitted on behalf of (maybe empty): HTML WG
Comment type: editorial
Chapter/section the comment applies to: 3.3 Transcoding
The comment will be visible to: public
Comment title: Give example of transcoding
Comment:
It would be useful if 3.3 gave an example of where transcoding is used, since this is a frequently misunderstood point with regard to XML and HTML. People (and some UAs) think that the encoding also specifies the repertoire/CCS.
Something along the lines of:
"For example, in XML and HTML, documents are always in Unicode, but they may be delivered to a user agent in an encoding for another coded character set (indicated by the encoding declaration in XML, and by the charset parameter of the HTTP Content-Type header in HTML). The user agent then transcodes the characters of the incoming document stream into Unicode code points. For example, a document delivered with encoding iso-8859-2 may contain the string 'ő&#x0151;', where the first character (LATIN SMALL LETTER O WITH DOUBLE ACUTE) is at code point 0xF5 in iso-8859-2. This will be transcoded so that there will be two identical characters at code point 0x0151 in the document as processed by the user agent."
Note: the string 'ő&#x0151;' is an actual character (an o with a double acute) followed by a numeric character reference (NCR) for the same character. Feel free to substitute any similar character if you wish.
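
To make the proposed example concrete, here is a minimal Python sketch of what a user agent would do with such a stream: one raw 0xF5 byte followed by a numeric character reference, written here in standard NCR syntax as &#x0151;. It is only an illustration under stated assumptions: the variable names are made up, and html.unescape stands in for the NCR resolution that a real XML or HTML parser performs itself.

```python
import html

# Byte stream as delivered, labelled as iso-8859-2 (assumed for this sketch).
# 0xF5 is the raw character; the rest is the ASCII text of the NCR "&#x0151;".
raw = b"\xf5&#x0151;"

# Step 1: transcode the incoming bytes into Unicode code points.
text = raw.decode("iso-8859-2")

# Step 2: expand numeric character references. An NCR always names a
# Unicode code point, whatever encoding the document was delivered in.
# (html.unescape is a stand-in for the parser's own NCR handling.)
resolved = html.unescape(text)

print([f"U+{ord(c):04X}" for c in resolved])
# prints ['U+0151', 'U+0151']
```

The point of the two separate steps is that the declared encoding only affects step 1; the NCR in step 2 is interpreted against Unicode, not against iso-8859-2.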
Structured version of the comment:
<lc-comment
visibility="public" status="pending"
decision="pending" impact="editorial">
<originator email="steven.pemberton@cwi.nl" represents="HTML WG"
>Steven Pemberton</originator>
<charmod-section href='http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-Transcoding'
>3.3</charmod-section>
<title>Give example of transcoding</title>
<description>
<comment>
<dated-link date="2002-07-03"
>Give example of transcoding</dated-link>
<para>It would be useful if 3.3 gave an example of where transcoding is used, since this is a frequently misunderstood point with regard to XML and HTML. People (and some UAs) think that the encoding also specifies the repertoire/CCS.
Something along the lines of:
"For example, in XML and HTML, documents are always in Unicode, but they may be delivered to a user agent in an encoding for another coded character set (indicated by the encoding declaration in XML, and by the charset parameter of the HTTP Content-Type header in HTML). The user agent then transcodes the characters of the incoming document stream into Unicode code points. For example, a document delivered with encoding iso-8859-2 may contain the string 'ő&amp;#x0151;', where the first character (LATIN SMALL LETTER O WITH DOUBLE ACUTE) is at code point 0xF5 in iso-8859-2. This will be transcoded so that there will be two identical characters at code point 0x0151 in the document as processed by the user agent."
Note: the string 'ő&amp;#x0151;' is an actual character (an o with a double acute) followed by a numeric character reference (NCR) for the same character. Feel free to substitute any similar character if you wish.
</para>
</comment>
</description>
</lc-comment>
Received on Wednesday, 3 July 2002 08:50:35 UTC