- From: Anne van Kesteren <annevk@annevk.nl>
- Date: Mon, 5 Nov 2012 16:20:44 +0100
- To: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Cc: David Sheets <kosmo.zb@gmail.com>, Ian Hickson <ian@hixie.ch>, "Manger, James H" <James.H.Manger@team.telstra.com>, Christophe Lauret <clauret@weborganic.com>, Jan Algermissen <jan.algermissen@nordsc.com>, Ted Hardie <ted.ietf@gmail.com>, URI <uri@w3.org>, "public-iri@w3.org" <public-iri@w3.org>
On Mon, Nov 5, 2012 at 12:19 PM, "Martin J. Dürst" <duerst@it.aoyama.ac.jp> wrote:
> That's for U+FFF0 to U+FFFD. U+FFF0 to U+FFFC are characters that are
> strictly reserved for internal processing, I think MS Word, among else, uses
> these. A browser that wanted to use these to simplify internal
> implementation would have trouble accepting them from the outside.

Given the way strings in browsers are really 16-bit code units (Mozilla's Rust might change that, I hear) with no restrictions I doubt that's a problem. And given that the input to the URL parser can certainly contain one of those code points you have to handle them somehow.

> Consistency across formats is definitely a good thing. But there are some
> serious differences between text and identifiers. Something that's harmless
> in text (e.g. a zero-width space) may be hopeless in an IRI/URL (because it
> creates a different address, leading to confusion).

Unicode has lots of space for confusion. I'll note that HTML defines an identifier too and it takes any code point except for ASCII whitespace:
http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#the-id-attribute
Incidentally for text/html, URL fragments can be used to refer to it...

> Actually, the characters that I currently would like to exclude most (not
> just in a spec, but actually in the browser implementations) are bidi
> control characters. RFC 3987 disallows them, but not in the syntax. Moving
> the restrictions to the syntax would give them more prominence. Allowing
> them in IRIs/URLs is just a wide open door for scams and phishers.

I don't really have an opinion on this. I can certainly assist filing bugs on implementors, but I doubt they are interested in taking this potential compatibility hit (if I understand correctly what you're proposing).

-- 
http://annevankesteren.nl/
Received on Monday, 5 November 2012 15:21:17 UTC