Re: Advice on making IRI document suitable for reference by HTML (and other specs) from Erik van der Poel on 2010-01-01 (public-iri@w3.org from January 2010)

From: Erik van der Poel <erikv@google.com>
Date: Thu, 31 Dec 2009 16:51:15 -0800
To: "Phillips, Addison" <addison@amazon.com>
Cc: "Roy T. Fielding" <fielding@gbiv.com>, "public-iri@w3.org" <public-iri@w3.org>
Message-ID: <c07a32650912311651p3247626ey556d4670ca1b3fb3@mail.gmail.com>

Again, there isn't just one output. In addition to outputs like the
HTTP Request-URI, there are DOM interfaces like pathname.

URL processing can be divided into parsing and resolution. The DOM
interfaces can be used to access the output of the parsing phase,
including, in the case of the DOM href interface, the absolute URL
that was produced by resolving a relative URL against a base URL. It
appears that many of the major browsers return the host in Unicode
through the DOM interfaces, even when it was written in Punycode in
the HTML source. How much of this should be in the HTML spec, and how much in
the DOM spec? This is also a "split", as Ian calls it.
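
To make the parsing/resolution distinction concrete, here is a minimal
sketch using Python's urllib.parse. It is not a model of what any
browser actually does; the base URL, the relative href and the variable
names are made up for illustration, and the Unicode host is produced
with Python's built-in "idna" codec (IDNA 2003), which is only a
stand-in for whatever a given browser uses.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical inputs: a base URL whose host is in Punycode, and a
# relative href as it might appear in the HTML.
base = "http://xn--bcher-kva.example/a/b.html"
href = "../c/d.html?q=1"

# Resolution: relative reference + base URL -> absolute URL.
absolute = urljoin(base, href)

# Parsing: split the absolute URL into components, roughly the pieces
# that DOM attributes such as pathname and search expose.
parts = urlsplit(absolute)

# Display form of the host: decode the Punycode labels to Unicode.
unicode_host = parts.hostname.encode("ascii").decode("idna")

print(absolute)      # the resolved absolute URL (host still in Punycode)
print(parts.path)    # the path component
print(parts.query)   # the query component, without the "?"
print(unicode_host)  # the host as Unicode
```

The point of the sketch is only that the "same" URL has several
observable forms: the resolved string, its components, and a Unicode
rendering of the host.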

The output of the resolution phase includes such things as the HTTP
Request-URI. The major HTML implementations all convert the query
part (after the "?") back to the character encoding of the HTML
document before placing it in the HTTP request. How much of this should be in the HTML
spec, and how much in the IRIbis spec? This is part of Ian's "split"
question.
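
The query behaviour described above can be sketched as follows. This
is an illustration, not any implementation's algorithm: request_uri
and its parameters are hypothetical names, the use of UTF-8 for the
path is an assumption of the sketch, and characters that cannot be
represented in the document encoding are not handled (quote() will
simply raise).

```python
from urllib.parse import quote


def request_uri(path: str, query: str, doc_encoding: str) -> str:
    """Build a Request-URI from already-parsed path and query strings.

    Assumption of this sketch: the path is percent-encoded as UTF-8,
    while the query is percent-encoded in the document's own encoding.
    """
    encoded_path = quote(path, safe="/")  # quote() defaults to UTF-8
    encoded_query = quote(query, safe="=&", encoding=doc_encoding)
    return f"{encoded_path}?{encoded_query}"


# Same link text, two different document encodings.
latin1_uri = request_uri("/caf\u00e9", "q=caf\u00e9", "iso-8859-1")
utf8_uri = request_uri("/caf\u00e9", "q=caf\u00e9", "utf-8")

print(latin1_uri)  # /caf%C3%A9?q=caf%E9
print(utf8_uri)    # /caf%C3%A9?q=caf%C3%A9
```

The two results differ only in the query, which is why the encoding of
the containing document has to be an input to whichever spec ends up
defining this step.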

I think HTML5 can come up with all of these spec pieces faster than
IRIbis can. Maybe I'm pessimistic, but I have seen how long these
things take in the IDNAbis work. One of the core disagreements there
was that one camp wanted all "pre-processing" to be performed in the
UI, while the other camp wanted HTML implementations to continue to
pre-process domain names in hrefs. I suspect that the same camps will
reappear in the IRIbis work, thereby delaying it.

Erik

Received on Friday, 1 January 2010 00:51:47 UTC