
Re: Generic URI syntax incompatible with IPv6 addresses in SIP URIs

From: Julian Reschke <julian.reschke@gmx.de>
Date: Wed, 01 Apr 2009 10:52:15 +0200
Message-ID: <49D32B3F.8080108@gmx.de>
To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
CC: Lisa Dusseault <lisa.dusseault@gmail.com>, public-iri@w3.org, Mark Baker <mark@coactus.com>, apps-discuss@ietf.org
Martin J. Dürst wrote:
> I agree with Julian that discussion about updates of RFC 3987 (IRI spec,
> new version currently at draft-duerst-iri-bis-05) should go to
> public-iri@w3.org, whereas bug reports and discussions about RFC 3986
> (URI spec) should go to uri@w3.org.
> 
> But Lisa, you are the AD who eventually has to handle anything that 
> comes out of these drafts, so I'm glad to follow your directions.
> 
> Julian, could you be more specific (pointer is sufficient) about what 
> you mean below with "potential issue (normalization)"?
> 
> Regards,    Martin.

It's the one raised by Anne vK: Section 3.1 of RFC 3987 requires different 
IRI->URI behavior depending on the encoding of the source document:

    Step 1.  Generate a UCS character sequence from the original IRI
             format.  This step has the following three variants,
             depending on the form of the input:

             a. If the IRI is written on paper, read aloud, or otherwise
                represented as a sequence of characters independent of
                any character encoding, represent the IRI as a sequence
                of characters from the UCS normalized according to
                Normalization Form C (NFC, [UTR15]).

             b. If the IRI is in some digital representation (e.g., an
                octet stream) in some known non-Unicode character
                encoding, convert the IRI to a sequence of characters
                from the UCS normalized according to NFC.

             c. If the IRI is in a Unicode-based character encoding (for
                example, UTF-8 or UTF-16), do not normalize (see section
                5.3.2.2 for details).  Apply step 2 directly to the
                encoded Unicode character sequence.

It would be helpful to understand where exactly this requirement comes 
from, and whether we have evidence that it's being implemented (or is even 
implementable; the source document encoding may not be known at the 
point where the IRI->URI conversion occurs).
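
To make the divergence concrete, here is a minimal Python sketch. It assumes the IRI has already been decoded to a character sequence and that the caller somehow knows whether the source encoding was Unicode-based. The helper name to_uri_path and the source_is_unicode flag are made up for illustration; they are not taken from RFC 3987 or any library, and step 2 is reduced to plain UTF-8 percent-encoding of a path.

```python
import unicodedata
from urllib.parse import quote


def to_uri_path(iri_path: str, source_is_unicode: bool) -> str:
    """Hypothetical helper: step 1 (b or c), then a simplified step 2."""
    if not source_is_unicode:
        # Variants a/b: normalize to NFC before conversion.
        iri_path = unicodedata.normalize("NFC", iri_path)
    # Variant c skips normalization entirely.
    # Simplified step 2: percent-encode the UTF-8 octets, keeping "/".
    return quote(iri_path, safe="/")


# "e" followed by U+0301 COMBINING ACUTE ACCENT (decomposed form).
decomposed = "/re\u0301sume\u0301"

print(to_uri_path(decomposed, source_is_unicode=True))
print(to_uri_path(decomposed, source_is_unicode=False))
```

The same character sequence yields /re%CC%81sume%CC%81 when treated as coming from a Unicode-based encoding and /r%C3%A9sum%C3%A9 otherwise, so the resulting URI depends on information that may no longer be available when the conversion runs.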

BR, Julian
Received on Wednesday, 1 April 2009 08:53:11 GMT
