- From: Maciej Stachowiak <mjs@apple.com>
- Date: Fri, 09 Apr 2010 01:13:15 -0700
- To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
- Cc: Ian Hickson <ian@hixie.ch>, Ted Hardie <ted.ietf@gmail.com>, Larry Masinter <LMM@acm.org>, Julian Reschke <julian.reschke@gmx.de>, Marc Blanchet <Marc.Blanchet@viagenie.ca>, Sam Ruby <rubys@intertwingly.net>, Paul Cotton <Paul.Cotton@microsoft.com>, Michel SUIGNARD <Michel@suignard.com>, public-html <public-html@w3.org>, "public-iri@w3.org" <public-iri@w3.org>
On Apr 9, 2010, at 1:00 AM, Martin J. Dürst wrote:

> Hello Ian,
>
> Many thanks for your very careful description of the issues below. I
> propose (to the IRI WG chairs) that we replace the current issue 1
> in our tracker with these two issues.
>
> More comments below. I tried to answer your questions to the best of
> my abilities.
>
> On 2010/04/09 10:40, Ian Hickson wrote:
>> On Thu, Apr 8, 2010 at 9:31 AM, Ted Hardie <ted.ietf@gmail.com>
>> wrote:
>>>
>>> my understanding is that the correct next step will be to describe
>>> this issue in a way that we can track.
>>
>> I've tried to write descriptions of the two issues. Please let me
>> know if you need any further advice on the matter.
>>
>> Issue 1:
>> =====================================================================
>> Update the IRI specification to define an algorithm with the
>> following characteristics:
>
> In order to make it easier to understand this for people who are not
> deeply involved in the HTML5 effort, I'd like to confirm that this
> is the algorithm that HTML5 uses to split a URI/IRI into various
> components, each of which is then accessible via a (JavaScript) DOM
> API function. So I guess the title of our issue should be something
> like:
> "Ensure that the IRI spec defines how to split an IRI into
> components in a way that's referenceable by the HTML5 spec" or some
> such.

That's the primary purpose, yes.
>
>> Input:
>>  * a string
>>
>> Output:
>>  * a boolean representing whether the algorithm succeeded or
>>    failed
>>  * if the algorithm succeeded, one or more strings corresponding
>>    to the following components, each of which may be present or
>>    absent:
>>     - <scheme> component
>>     - <host> component
>>     - <port> component
>>     - <hostport> component
>>     - <path> component
>>     - <query> component
>>     - <fragment> component
>>     - <host-specific> component
>>
>> This algorithm must be such that it can be used where HTML5 says "the
>> user agent must use the parse an address algorithm defined by the IRI
>> specification" in a manner that user agents including major browser
>> vendors will be willing to implement the algorithm as written.
>>
>> Exactly what this algorithm must do is a matter that will need
>> careful research, reverse-engineering existing UAs.
>
> My understanding was that a lot of this research had already been
> done, and that we would basically try to match whatever was in the
> HTML5 spec before Dan Connolly and Michael Sperberg-McQueen
> extracted it into a separate draft. Of course, we should always be
> open to new information coming up, but your sentence above sounds
> much more like we have to start anew. Can you clarify?

I think the old contents of HTML5, or even the content of the now
abandoned Web Addresses spec, would be a good starting point. However,
I believe that both Web Addresses and the old spec have bugs. New
testing would be advisable to confirm some of the details and check
edge cases.

>
>> The algorithm needs to be defined in such a way that it can be
>> referenced unambiguously by name. For example, text such as the
>> following could be used to introduce this algorithm:
>>
>>    When a specification says that a user agent is to *parse an
>>    address*, given a string INPUT, it must run the following steps,
>>    which return a failure/success condition and a set of components:
>>
>>    ...
>>
>> This gives a completely unambiguous and clear way to invoke the
>> algorithm described in the spec, along with RFC2119-level clarity
>> regarding what such invocations imply for the user agent.
>> =====================================================================
>>
>> Issue 2:
>> =====================================================================
>> Update the IRI specification to define an algorithm with the
>> following characteristics:
>
> Again to clarify here, if I understand correctly, the HTML5 spec
> needs such an algorithm to resolve relative references with respect
> to a base URI (my wild guess is that B is the base, and A is the
> relative URI below, can you confirm)?

Specifically, String A is a possibly-relative URI (really a
possibly-relative IRI reference with lenient Web Address processing),
and String B is an absolute URI that is the base. String A is resolved
against String B as a base, though if String A happens to be absolute,
then A itself will be returned.

>
> Regards, Martin.
>
>
>> Input:
>>  * a string A
>>  * a string B, which was previously output from this algorithm
>>  * a character encoding
>>
>> Output:
>>  * a boolean representing whether the algorithm succeeded or
>>    failed
>>  * if the algorithm succeeded, a string
>>
>> This algorithm must be such that it can be used where HTML5 says "the
>> result of applying the resolve an address algorithm defined by the
>> IRI specification to resolve url relative to base using encoding
>> encoding" in a manner that user agents including major browser
>> vendors will be willing to implement the algorithm as written.
>>
>> Exactly what this algorithm must do is a matter that will need
>> careful research, reverse-engineering existing UAs.
>>
>> The algorithm needs to be defined in such a way that it can be
>> referenced unambiguously by name.
>> For example, text such as the following could be used to
>> introduce this algorithm:
>>
>>    When a specification says that a user agent is to *resolve an
>>    address*, given a string INPUT, a second string BASE, and a
>>    character encoding ENCODING, it must run the following steps,
>>    which return a failure/success condition and a string:
>>
>>    ...
>>
>> This gives a completely unambiguous and clear way to invoke the
>> algorithm described in the spec, along with RFC2119-level clarity
>> regarding what such invocations imply for the user agent.
>> =====================================================================
>
>
> Regards, Martin.
>
> --
> #-# Martin J. Dürst, Professor, Aoyama Gakuin University
> #-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
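
To make the shape of the two requested interfaces concrete, here is a minimal Python sketch. It is an illustration only, not the algorithm the IRI spec would define: it assumes the standard library's urllib.parse (RFC 3986-style splitting and resolution) as a stand-in for the lenient browser behaviour discussed above. The names ParsedAddress, parse_address, and resolve_address are hypothetical, the <host-specific> component is omitted because the thread does not define it, and the character encoding argument is accepted but not used.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import urljoin, urlsplit


@dataclass
class ParsedAddress:
    """Hypothetical container; None means the component is absent."""
    scheme: Optional[str] = None
    host: Optional[str] = None
    port: Optional[str] = None
    hostport: Optional[str] = None
    path: Optional[str] = None
    query: Optional[str] = None
    fragment: Optional[str] = None


def parse_address(text: str) -> Tuple[bool, Optional[ParsedAddress]]:
    """Issue 1 shape: return (succeeded, components)."""
    try:
        parts = urlsplit(text)
        port = parts.port  # raises ValueError for a non-numeric port
    except ValueError:
        return False, None
    host = parts.hostname
    port_text = None if port is None else str(port)
    if host is None or port_text is None:
        hostport = host
    else:
        hostport = f"{host}:{port_text}"
    # Assumption: an empty component is reported as absent.
    return True, ParsedAddress(
        scheme=parts.scheme or None,
        host=host,
        port=port_text,
        hostport=hostport,
        path=parts.path or None,
        query=parts.query or None,
        fragment=parts.fragment or None,
    )


def resolve_address(a: str, b: str,
                    encoding: str = "utf-8") -> Tuple[bool, Optional[str]]:
    """Issue 2 shape: resolve A against base B, return (succeeded, string).

    The encoding parameter is part of the requested interface but is
    ignored in this sketch.
    """
    ok, base = parse_address(b)
    if not ok or base is None or base.scheme is None:
        return False, None  # assumption: the base must be absolute
    try:
        return True, urljoin(b, a)
    except ValueError:
        return False, None
```

For example, resolve_address("../c?x=1", "http://example.com/a/b") resolves the relative reference against the base, while an absolute first argument is returned as-is, matching the description of String A and String B above.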
Received on Friday, 9 April 2010 08:13:51 UTC