RE: ACTION-574: proposal on prefix rewriting from Pratik Datta on 2010-06-22 (public-xmlsec@w3.org from June 2010)

From: Pratik Datta <pratik.datta@oracle.com>
Date: Tue, 22 Jun 2010 11:45:42 -0700 (PDT)
To: Scott Cantor <cantor.2@osu.edu>, XMLSec WG Public List <public-xmlsec@w3.org>
Message-ID: <0d19e1a3-c45d-4b92-911d-5900ac2f3fe0@default>

Do you think we should use UTF-8 encoding instead?

I tried reading the IRI spec, and from what I understand it says that all non-ASCII characters should be converted to ASCII using the %HH (percent-encoding) notation. In that case US-ASCII should be fine, shouldn't it?
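
To make the question concrete, here is a minimal sketch of that conversion, assuming the mapping in RFC 3987 section 3.1 (each non-ASCII character is encoded as UTF-8 and each resulting octet is written as %HH). The function names are hypothetical, and SHA-256 is only a stand-in for whatever digest the proposal ends up using. It leaves ASCII characters, including existing % escapes, untouched.

```python
import hashlib


def iri_to_uri(iri: str) -> str:
    """Percent-encode every non-ASCII character as its UTF-8 octets."""
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            # ASCII characters (including existing %HH escapes) pass through.
            out.append(ch)
        else:
            # Non-ASCII: encode as UTF-8, then write each octet as %HH.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


def digest_namespace(iri: str) -> str:
    """Hex digest of the namespace string after mapping it to ASCII bytes."""
    # After the mapping the string is pure ASCII, so the US-ASCII and
    # UTF-8 encodings produce identical bytes.
    data = iri_to_uri(iri).encode("ascii")
    return hashlib.sha256(data).hexdigest()
```

If the mapping is applied first, the choice between US-ASCII and UTF-8 for the final bytes makes no difference, because the two encodings agree on every ASCII character.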

Pratik

-----Original Message-----
From: Scott Cantor [mailto:cantor.2@osu.edu] 
Sent: Tuesday, June 22, 2010 10:20 AM
To: Pratik Datta; XMLSec WG Public List
Subject: RE: ACTION-574: proposal on prefix rewriting

> I realized that a URI is a sequence of characters, it can't be digested
> unless it is converted to bytes. For this I am proposing that we use
> US-ASCII encoding, because URI are limited to US-ASCII characters aren't
> they?

No, they're not, although it tends to be good practice in namespace URIs to
avoid pushing it and using IRIs or appending crazy path info. Your proposal
seems to be to force them to be normalized into US-ASCII by URL encoding any
code points that aren't. We should check on that to make sure that's
sufficiently well-defined, but it sounds reasonable.

-- Scott

Received on Tuesday, 22 June 2010 18:47:07 UTC