
Re: Standardizing on IDNA 2003 in the URL Standard

From: Gervase Markham <gerv@mozilla.org>
Date: Thu, 22 Aug 2013 14:05:34 +0100
Message-ID: <52160C9E.20702@mozilla.org>
To: Anne van Kesteren <annevk@annevk.nl>
CC: Mark Davis ☕ <mark@macchiato.com>, Shawn Steele <Shawn.Steele@microsoft.com>, IDNA update work <idna-update@alvestrand.no>, "PUBLIC-IRI@W3.ORG" <public-iri@w3.org>, "uri@w3.org" <uri@w3.org>, John C Klensin <klensin@jck.com>, Peter Saint-Andre <stpeter@stpeter.im>, Marcos Sanz <sanz@denic.de>, Vint Cerf <vint@google.com>, "www-tag.w3.org" <www-tag@w3.org>
On 22/08/13 13:36, Anne van Kesteren wrote:
> As far as UseSTD3ASCIIRules is concerned, I haven't checked if TR46 is
> safe when it comes to
> https://www.w3.org/Bugs/Public/show_bug.cgi?id=23009 if you turn that
> flag off.

AIUI, assuming we write our replacement for the STD3ASCIIRules to
disallow "/" in hostnames, we should be fine. When UseSTD3ASCIIRules is
false, "℁" (U+2101) will map to "a/s", and then the "/" will be disallowed.

TR46 section 4.1:

"If UseSTD3ASCIIRules=false, then the validity tests for ASCII
characters are not provided by the table status values, but are
implementation-dependent. For example, if an implementation allows the
characters [\u002Da-zA-Z0-9] and also the underbar (_), then it needs to
use the table values for UseSTD3ASCIIRules=false, and test for any other
ASCII characters as part of its validity criteria. *These ASCII
characters may have resulted from a mapping*: for example, a U+005F ( _
) LOW LINE (underbar) may have originally been a U+FF3F ( ＿ ) FULLWIDTH
LOW LINE."

(Emphasis mine.)
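
To make the ordering concrete, here is a rough sketch of "map first, then test the ASCII that comes out". It is an illustration under assumptions, not the TR46 algorithm: NFKC plus lowercasing stands in for the real TR46 mapping table, and ALLOWED_ASCII, map_host and check_host are hypothetical names for whatever replacement for the STD3 rules we end up writing.

```python
import unicodedata

# Hypothetical allow-list for our STD3 replacement: letters, digits,
# hyphen, plus the underbar. "/" is deliberately absent.
ALLOWED_ASCII = set("abcdefghijklmnopqrstuvwxyz0123456789-_")


def map_host(host):
    # Assumption: NFKC + lowercase approximates the TR46 mapping step.
    # The real mapping comes from the TR46 data table, not from NFKC alone.
    return unicodedata.normalize("NFKC", host).lower()


def check_host(host):
    """Return (mapped_host, offending_ascii_chars).

    The validity test runs on the *mapped* string, so ASCII characters
    that only appear as a result of mapping are still caught.
    """
    mapped = map_host(host)
    offending = [
        ch
        for label in mapped.split(".")
        for ch in label
        if ord(ch) < 0x80 and ch not in ALLOWED_ASCII
    ]
    return mapped, offending


if __name__ == "__main__":
    for candidate in ("\u2101.example", "foo\uff3fbar.example"):
        print(repr(candidate), check_host(candidate))
```

The point of the sketch is only that the "/" check has to happen after mapping. Testing the raw input would let U+2101 through, since it contains no ASCII "/" until it has been mapped.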

Gerv
Received on Thursday, 22 August 2013 13:06:11 UTC
