- From: Shawn Steele <Shawn.Steele@microsoft.com>
- Date: Wed, 6 Jan 2010 00:24:30 +0000
- To: Chris Weber <chris@casabasecurity.com>, "'Phillips, Addison'" <addison@amazon.com>, "public-iri@w3.org" <public-iri@w3.org>
I'm not sure how interesting escaping is to prevent "visual spoofing". Most users won't distinguish between %xx and %yy if they saw them, but I don't think they'd even see them because the browser or whatever would likely display the IRI in a friendly form anyway, with unescaped spaces.

-Shawn

-----Original Message-----
From: public-iri-request@w3.org [mailto:public-iri-request@w3.org] On Behalf Of Chris Weber
Sent: , 05, 2010 11:39
To: 'Phillips, Addison'; public-iri@w3.org
Subject: RE: space characters: should we require escapes on all of them?

If the goal was to mitigate visual spoofing potential, then escaping Zs category characters would seem a good start. But would you stop there? Special characters such as the BOM U+FEFF, which has no direct mention I found in draft-07, could be used to exploit zero-width spacing, as could the joiners and other characters you're all probably familiar with. Combining marks could also be stacked in clever ways to make for invisible attacks.

On this subject, is this a bug in the spec section "7.3. Characters not allowed in IRIs" where it says:

Specials (U+FFF0-FFFD): These code points provide functionality beyond that useful in an IRI, for example byte order identification, annotation, and replacements for unknown characters and objects. Their use and interpretation in an IRI would serve no purpose and might lead to confusing display variations.

When it refers to "byte order identification" did it mean to include U+FEFF in the range?

Chris Weber
Security Research
Casaba Security

-----Original Message-----
From: public-iri-request@w3.org [mailto:public-iri-request@w3.org] On Behalf Of Phillips, Addison
Sent: Monday, January 04, 2010 4:23 PM
To: public-iri@w3.org
Subject: space characters: should we require escapes on all of them?

Allowing (or not) a space character in a web address was mentioned recently in the thread on HTML5, and I got to thinking: Unicode also includes other non-control whitespace characters and these don't appear to be dealt with anywhere, including the security section of draft-07.

I like that IRIs do not have spaces in them. An IRI is an identifier and should not be regarded as a repository for prose. But, since the space character must be escaped, I think perhaps that the other Unicode whitespace characters (category Zs) should be treated similarly and would suggest adding a prohibition on them to section 7.3 in draft-07. This would help defend against visual spoofing such as using an "em space" (U+2003) to make a single IRI look like two adjacent IRIs.

If we don't prohibit these characters, maybe there should at least be a note in the security section mentioning them for exactly that reason.

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.
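
To make the proposal in this thread concrete, here is a minimal Python sketch of what "treat category Zs like the ASCII space" could look like in practice. It is an illustration only, not text from draft-07: the helper names `suspicious_chars` and `escape_spaces` are hypothetical, and the `EXTRA_INVISIBLE` set (U+FEFF plus a few zero-width characters of the kind Chris mentions) is an assumption of this sketch rather than a list the draft defines.

```python
import unicodedata
from urllib.parse import quote

# Assumption of this sketch: a few zero-width characters to flag in addition
# to category Zs. draft-07 does not define this list.
EXTRA_INVISIBLE = {"\uFEFF", "\u200B", "\u200C", "\u200D", "\u2060"}


def _is_flagged(ch):
    """True for Unicode space separators (Zs) and the extra invisible characters."""
    return unicodedata.category(ch) == "Zs" or ch in EXTRA_INVISIBLE


def suspicious_chars(iri):
    """Return (index, code point, name) for every flagged character in the IRI."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(iri)
        if _is_flagged(ch)
    ]


def escape_spaces(iri):
    """Percent-encode the UTF-8 bytes of flagged characters; leave the rest as-is."""
    return "".join(quote(ch, safe="") if _is_flagged(ch) else ch for ch in iri)


# One IRI that renders like two adjacent IRIs because of an EM SPACE.
spoof = "http://example.com/a\u2003http://example.org/b"
print(suspicious_chars(spoof))
print(escape_spaces(spoof))
```

Whether escaping helps depends on Shawn's point: if the displaying application unescapes these characters again for a "friendly" view, the check would need to run at display time rather than only when the IRI is produced.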
Received on Wednesday, 6 January 2010 00:25:07 UTC