- From: Misha Wolf <misha.wolf@reuters.com>
- Date: Wed, 08 Nov 2000 18:37:50 +0000 (GMT)
- To: xml-editor@w3.org
- Cc: w3c-i18n-ig@w3.org
Extensible Markup Language (XML) 1.0 (Second Edition), in:
4.2.2 External Entities
http://www.w3.org/TR/REC-xml#sec-external-ent
states:
| URI references require encoding and escaping of certain characters. The
| disallowed characters include all non-ASCII characters, plus the
| excluded characters listed in Section 2.4 of [IETF RFC 2396], except for
| the number sign (#) and percent sign (%) characters and the square
| bracket characters re-allowed in [IETF RFC 2732]. Disallowed characters
| must be escaped as follows:
|
| Each disallowed character is converted to UTF-8 [IETF RFC 2279] as one
| or more bytes.
|
| Any octets corresponding to a disallowed character are escaped with the
| URI escaping mechanism (that is, converted to %HH, where HH is the
| hexadecimal notation of the byte value).
|
| The original character is replaced by the resulting character sequence.
The quoted text uses "byte" twice ("one or more bytes", "byte value")
and "octet" once ("Any octets"). Could we please standardise on one
term or the other?
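For reference, the escaping steps quoted above can be sketched roughly as follows. This is only an illustration, not text from the Recommendation: the function name and the EXCLUDED_ASCII set are my own assumptions, the latter built from my reading of the excluded characters in RFC 2396 Section 2.4.3 (control characters, space, delims, unwise) minus "#", "%", "[" and "]".

```python
# Hypothetical helper, not part of any specification or library.
# Assumed excluded set: RFC 2396 section 2.4.3 characters, minus the
# ones the XML text re-allows ('#', '%', '[' and ']').
EXCLUDED_ASCII = set('<>"{}|\\^` ')


def escape_system_identifier(text: str) -> str:
    """Escape disallowed characters as %HH sequences of their UTF-8 octets."""
    out = []
    for ch in text:
        code = ord(ch)
        disallowed = (
            code > 0x7F              # all non-ASCII characters
            or code <= 0x1F          # ASCII control characters
            or code == 0x7F          # DEL
            or ch in EXCLUDED_ASCII  # space, delims, unwise
        )
        if disallowed:
            # Step 1: convert the character to UTF-8 (one or more octets).
            # Step 2: write each octet as %HH in hexadecimal.
            out.append("".join(f"%{octet:02X}" for octet in ch.encode("utf-8")))
        else:
            out.append(ch)
    # Step 3: each original character has been replaced by its escape sequence.
    return "".join(out)
```

Note that an existing "%" is left untouched, so input that is already escaped is not escaped a second time.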
Misha
[This mail was written using voice recognition software]
Received on Wednesday, 8 November 2000 13:38:17 UTC