Re: Some issues with the IRI document [e9notutf8-05] from Martin Duerst on 2003-04-15 (public-iri@w3.org from April 2003)

From: Martin Duerst <duerst@w3.org>
Date: Tue, 15 Apr 2003 17:45:54 -0400
To: Paul Hoffman / IMC <phoffman@imc.org>, public-iri@w3.org
Message-Id: <4.2.0.58.J.20030415172920.02cdba60@localhost>

At 08:14 03/04/08 -0700, Paul Hoffman / IMC wrote:

>Clarifications:

>The last paragraph of 1.2 is confusing in the middle where you talk about 
>UTF-8. 0xE9 is not the representation of a UTF-8 character. Even though 
>the example is wrong, it got me stuck in UTF-8 mode, which helped get me 
>stuck in thinking that you were talking sometimes about the encoding.

Thanks for spotting this mistake. This issue is listed as
http://www.w3.org/International/iri-edit/Overview.html#e9notutf8-05

The text in that paragraph read:

    For example, for a document with a URI of
    http://www.example.org/r%C3%A9sum%C3%A9.html, it is possible to
    construct a corresponding IRI (in XML notation, see Section 1.4):
    http://www.example.org/r&#xe9;sum&#xe9;.html (&#xe9; stands for the
    e-acute character, and is the UTF-8 encoded and escaped
    representation of that character).  On the other hand, for a document
    with an URI of http://www.example.org/r%E9sum%E9.html, the escaped
    octets cannot be converted to actual characters in an IRI, because
    the escaping is based on iso-8859-1 rather than UTF-8.

The text in parentheses should have read:

    (&#xe9; stands for the e-acute character, and %C3%A9 is the UTF-8
    encoded and escaped representation of that character)

I have fixed that in my internal copy. Does this change make the
paragraph easier to understand?
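
For anyone who wants to check the two escapings by hand, here is a
minimal Python sketch. It is only an illustration, not text from the
draft: it assumes Python 3 and the standard urllib.parse module, and
the names E_ACUTE and escaped_to_chars are made up for this example.

```python
from urllib.parse import quote, unquote_to_bytes

# The e-acute character, written &#xe9; in XML notation (U+00E9).
E_ACUTE = "\u00e9"

# Escape the same character from two different octet encodings.
utf8_escaped = quote(E_ACUTE, encoding="utf-8")         # UTF-8 based
latin1_escaped = quote(E_ACUTE, encoding="iso-8859-1")  # iso-8859-1 based


def escaped_to_chars(escaped):
    """Hypothetical helper: turn escaped octets into characters.

    Returns the decoded string if the octets are valid UTF-8,
    or None if they are not (for example iso-8859-1 based escapes).
    """
    try:
        return unquote_to_bytes(escaped).decode("utf-8")
    except UnicodeDecodeError:
        return None


# UTF-8 based escapes can be turned into actual characters.
iri_path = escaped_to_chars("r%C3%A9sum%C3%A9.html")

# iso-8859-1 based escapes are not valid UTF-8, so this yields None.
not_convertible = escaped_to_chars("r%E9sum%E9.html")

print(utf8_escaped, latin1_escaped, iri_path, not_convertible)
```

The strict UTF-8 decode is what separates the two cases: the octet E9
on its own is not a complete UTF-8 sequence, so the second path is
left escaped rather than converted.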

Regards,   Martin.

Received on Tuesday, 15 April 2003 17:48:09 UTC