Re: Some issues with the IRI document [e9notutf8-05]

At 08:14 03/04/08 -0700, Paul Hoffman / IMC wrote:

>Clarifications:

>The last paragraph of 1.2 is confusing in the middle where you talk about 
>UTF-8. 0xE9 is not the representation of a UTF-8 character. Even though 
>the example is wrong, it got me stuck in UTF-8 mode, which helped get me 
>stuck in thinking that you were talking sometimes about the encoding.

Thanks for spotting this mistake. This issue is listed as
http://www.w3.org/International/iri-edit/Overview.html#e9notutf8-05

The text in that paragraph read:

    For example, for a document with a URI of
    http://www.example.org/r%C3%A9sum%C3%A9.html, it is possible to
    construct a corresponding IRI (in XML notation, see Section 1.4):
    http://www.example.org/r&#xE9;sum&#xE9;.html (&#xE9; stands for the
    e-acute character, and is the UTF-8 encoded and escaped
    representation of that character).  On the other hand, for a document
    with an URI of http://www.example.org/r%E9sum%E9.html, the escaped
    octets cannot be converted to actual characters in an IRI, because
    the escaping is based on iso-8859-1 rather than UTF-8.

The text in parentheses should have read:

    (&#xE9; stands for the e-acute character, and %C3%A9 is the UTF-8
    encoded and escaped representation of that character)

I have fixed that in my internal copy. Do you think that this change
helps you to understand the paragraph better?
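
To make the distinction concrete, here is a minimal Python sketch (not part of the draft) that checks whether the escaped octets in a path segment form valid UTF-8. The helper name to_iri_segment is hypothetical, and the sketch unescapes every escape sequence, which is a simplification: a real URI-to-IRI conversion would leave some escapes untouched.

```python
from typing import Optional
from urllib.parse import quote, unquote_to_bytes


def to_iri_segment(escaped: str) -> Optional[str]:
    """Return the characters for a %-escaped segment, or None if the
    escaped octets are not valid UTF-8 (hypothetical helper)."""
    octets = unquote_to_bytes(escaped)  # "%C3%A9" becomes b"\xc3\xa9"
    try:
        return octets.decode("utf-8")
    except UnicodeDecodeError:
        # For example, a lone 0xE9 octet from iso-8859-1 escaping.
        return None


# e-acute is U+00E9; its UTF-8 encoding is the two octets C3 A9.
utf8_escaped = quote("\u00e9")                           # UTF-8 by default
latin1_escaped = quote("\u00e9", encoding="iso-8859-1")  # single octet E9

print(utf8_escaped, latin1_escaped)
print(to_iri_segment("r%C3%A9sum%C3%A9.html"))
print(to_iri_segment("r%E9sum%E9.html"))
```

Under these assumptions the first segment converts to characters, while the second yields None because E9 followed by an ASCII octet is not a valid UTF-8 sequence.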

Regards,   Martin.

Received on Tuesday, 15 April 2003 17:48:09 UTC