RE: reviewing draft-weber-iri-guidelines-00 from Phillips, Addison on 2011-07-11 (public-iri@w3.org from July 2011)

From: Phillips, Addison <addison@lab126.com>
Date: Mon, 11 Jul 2011 16:35:39 -0700
To: Chris Weber <chris@lookout.net>, "public-iri@w3.org" <public-iri@w3.org>
Message-ID: <131F80DEA635F044946897AFDA9AC3476A9439F508@EX-SEA31-D.ant.amazon.com>

> 
> I did mean numerical character references, and not HTML named entity
> references.
> 
> Should all numerical character references be replaced with their corresponding
> character?  e.g. "&#x0041;" would become "A".
> 
> For example, as the following:
> 
> htt&#x0070;://www.example.com/&#x0066;&#x006f;&#x006f;bar&#x002f;foo
> &#x003f;bar
> 
> When prepared for parsing would become:
> 
> http://www.example.com/foobar/foo?bar
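
To make the quoted example concrete, here is a minimal sketch of that preparation step in Python. The function name decode_ncrs and the regular expression are illustrative assumptions, not text from the draft. It decodes every numeric character reference regardless of character class (it is not restricted to <iunreserved>) and deliberately leaves named entity references alone.

```python
import re

# Matches decimal (&#65;) and hexadecimal (&#x41;) numeric character references.
# Hypothetical helper pattern; not taken from draft-weber-iri-guidelines-00.
_NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_ncrs(text: str) -> str:
    """Replace numeric character references; leave named entities untouched."""

    def _sub(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code)
        except (ValueError, OverflowError):
            # Outside the Unicode code point range: keep the reference as written.
            return match.group(0)

    return _NCR.sub(_sub, text)


# The quoted example, with its two wrapped lines joined into one string.
raw = ("htt&#x0070;://www.example.com/"
       "&#x0066;&#x006f;&#x006f;bar&#x002f;foo&#x003f;bar")
print(decode_ncrs(raw))  # http://www.example.com/foobar/foo?bar
```

Because the sketch decodes "&#x002f;" and "&#x003f;" as well, delimiters such as "/" and "?" appear in the output before any parsing happens, which is exactly the behavior the question about limiting the step is asking about.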

> 
> Should this step of unescaping be limited to the <iunreserved> set?
> 

I don't see how it can be. Note that HTML processors should be removing the escapes at a higher level of processing.

Just curious, why don't you mean unescaping of %-encoded values? I would think that one goal in processing an IRI would be to arrive at the "canonical" IRI.

Addison

Received on Monday, 11 July 2011 23:36:04 UTC