RE: space characters: should we require escapes on all of them?

No, but they'd better distinguish this:

http://aa.goodguy.com%ee%80%80.blackhat.net


from something like:

http://aa.goodguy.com .blackhat.net
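
A minimal sketch of how the two strings above could be told apart programmatically, assuming UTF-8 percent-escapes. The helper name `describe_hidden` is made up for illustration and comes from no draft or library:

```python
import unicodedata
from urllib.parse import unquote


def describe_hidden(address):
    """List (code point, general category) for each space or non-ASCII
    character in the address once its percent-escapes are decoded as UTF-8."""
    decoded = unquote(address)  # urllib decodes escapes as UTF-8 by default
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch))
        for ch in decoded
        if ch == " " or ord(ch) > 0x7F
    ]


escaped = "http://aa.goodguy.com%ee%80%80.blackhat.net"
spaced = "http://aa.goodguy.com .blackhat.net"

print(describe_hidden(escaped))  # [('U+E000', 'Co')]  private-use character
print(describe_hidden(spaced))   # [('U+0020', 'Zs')]  ordinary space
```

The escaped form hides a private-use code point; the second form contains a plain space. A display layer that renders both "in a friendly form" loses exactly this difference.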

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

> -----Original Message-----
> From: Shawn Steele [mailto:Shawn.Steele@microsoft.com]
> Sent: Tuesday, January 05, 2010 4:25 PM
> To: Chris Weber; Phillips, Addison; public-iri@w3.org
> Subject: RE: space characters: should we require escapes on all of them?
> 
> I'm not sure how interesting escaping is to prevent "visual spoofing".
> Most users won't distinguish between %xx and %yy if they saw them, but
> I don't think they'd even see them because the browser or whatever
> would likely display the IRI in a friendly form anyway, with unescaped
> spaces.
> 
> -Shawn
> 
> -----Original Message-----
> From: public-iri-request@w3.org [mailto:public-iri-request@w3.org] On
> Behalf Of Chris Weber
> Sent: ,  05,  2010 11:39
> To: 'Phillips, Addison'; public-iri@w3.org
> Subject: RE: space characters: should we require escapes on all of them?
> 
> If the goal was to mitigate visual spoofing potential, then escaping Zs
> category characters would seem a good start.  But would you stop there?
> Special characters such as the BOM U+FEFF, which has no direct mention
> I found in draft-07, could be used to exploit zero-width spacing, as
> could the joiners and other characters you're all probably familiar
> with. Combining marks could also be stacked in clever ways to make for
> invisible attacks.
> 
> On this subject, is this a bug in the spec section "7.3.  Characters
> not allowed in IRIs" where it says:
> 
>       Specials (U+FFF0-FFFD): These code points provide functionality
>       beyond that useful in an IRI, for example byte order
>       identification, annotation, and replacements for unknown
>       characters and objects.  Their use and interpretation in an IRI
>       would serve no purpose and might lead to confusing display
>       variations.
> 
> When it refers to "byte order identification" did it mean to include
> U+FEFF in the range?
> 
> 
> Chris Weber
> Security Research
> Casaba Security
> 
> 
> 
> 
> -----Original Message-----
> From: public-iri-request@w3.org [mailto:public-iri-request@w3.org] On
> Behalf Of Phillips, Addison
> Sent: Monday, January 04, 2010 4:23 PM
> To: public-iri@w3.org
> Subject: space characters: should we require escapes on all of them?
> 
> Allowing (or not) a space character in a web address was mentioned
> recently in the thread on HTML5, and I got to thinking: Unicode also
> includes other non-control whitespace characters and these don't appear
> to be dealt with anywhere, including the security section of draft-07.
> 
> I like that IRIs do not have spaces in them. An IRI is an identifier
> and should not be regarded as a repository for prose. But, since the
> space character must be escaped, I think perhaps that the other Unicode
> whitespace characters (category Zs) should be treated similarly and
> would suggest adding a prohibition on them to section 7.3 in draft-07.
> This would help defend against visual spoofing such as using an "em
> space" (U+2003) to make a single IRI look like two adjacent IRIs.
> 
> If we don't prohibit these characters, maybe there should at least be a
> note in the security section mentioning them for exactly that reason.
> 
> Addison
> 
> Addison Phillips
> Globalization Architect -- Lab126
> 
> Internationalization is not a feature.
> It is an architecture.
> 
> 
> 
> 
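
The character classes raised in the quoted thread (category Zs spaces, U+FEFF, the joiners, stacked combining marks) could be screened for along these lines. This is only a sketch under stated assumptions: the function name `flag_invisible`, the `max_marks` threshold, and the `example` host names are hypothetical, and nothing here is prescribed by draft-07.

```python
import unicodedata


def flag_invisible(iri, max_marks=2):
    """Return (index, code point, reason) for characters that can hide or
    fake whitespace. max_marks is an assumed, arbitrary threshold for how
    many consecutive combining marks are tolerated before flagging."""
    findings = []
    run = 0  # length of the current run of combining marks
    for i, ch in enumerate(iri):
        cat = unicodedata.category(ch)
        label = f"U+{ord(ch):04X}"
        if cat == "Zs":
            # Space separators, e.g. U+2003 EM SPACE
            findings.append((i, label, "space separator"))
        elif cat == "Cf":
            # Format characters, e.g. U+FEFF and the zero-width joiners
            findings.append((i, label, "format character"))
        if cat in ("Mn", "Me"):
            run += 1
            if run == max_marks + 1:  # flag once per over-long run
                findings.append((i, label, "stacked combining marks"))
        else:
            run = 0
    return findings


print(flag_invisible("http://a.example\u2003http://b.example"))
# [(16, 'U+2003', 'space separator')]
print(flag_invisible("a\ufeffb"))
# [(1, 'U+FEFF', 'format character')]
```

Flagging every combining mark would reject ordinary text in many scripts, so the sketch reports only runs longer than the threshold. Whether such characters should be prohibited, escaped, or merely noted in the security section is the open question of the thread.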

Received on Wednesday, 6 January 2010 01:30:44 UTC