- From: David Sheets <kosmo.zb@gmail.com>
- Date: Tue, 25 Sep 2012 11:20:15 -0700
- To: Anne van Kesteren <annevk@annevk.nl>
- Cc: whatwg <whatwg@whatwg.org>, Ian Hickson <ian@hixie.ch>
On Tue, Sep 25, 2012 at 8:03 AM, Anne van Kesteren <annevk@annevk.nl> wrote:
> On Tue, Sep 25, 2012 at 6:18 AM, Ian Hickson <ian@hixie.ch> wrote:
>> Not necessarily, but that's certainly possible. Personally I would
>> recommend that we not change the definition of what is conforming from the
>> current RFC3986/RFC3987 rules, except to the extent that the character
>> encoding affects it (as per the HTML standard today).
>>
>> http://whatwg.org/html#valid-url
>
> FWIW, given that browsers happily do requests to servers with
> characters in the URL that are "invalid" per the RFC (they are not URL
> escaped) and servers handle them fine I think we should make the
> syntax more lenient. E.g. allowing [ and ] in the path and query
> component is fine I think.

I believe this would introduce ambiguity for parsing URI references. Is
"[::1]" an authority reference or a path segment reference?

> As for the question about why not build this on top of RFC 3986. That
> does not handle non-ASCII code points. RFC 3987 does, but is not a
> suitable start either. As shown in http://url.spec.whatwg.org/ it is
> quite trivial to combine parsing, resolving, and canonicalizing into a
> single algorithm (and deal with URI/IRI, now URL, as one).

Composition is often trivial but unenlightening. There is necessarily
less information in a partially evaluated function composition than in
the functions in isolation. Defining a formal language accurately and in
a broadly understandable manner is nontrivial. Your task is nontrivial.

> Trying to
> somehow patch the language in RFC 3987 to deal with the encoding
> problems for the query component, to deal with parsing
> http:example.org when there is a base URL with the same scheme versus
> when there isn't, etc. is way more of a hassle I think, though I am
> happy to be proven wrong.

I believe the encoding problems are handled by a normalization algorithm
and parsing relative references is handled by the base scheme module.

What is the acceptable trade-off between (y)our hassle and the time of
technologists in the coming decades? Will you make it easier or harder
for them to reconcile WHATWG-URL and Internet Standard 66 (RFC 3986)?

> --
> http://annevankesteren.nl/
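
To make the "[::1]" question above concrete, here is a minimal sketch (not part of the original message) that splits a reference with the regular expression given in RFC 3986, Appendix B. The names URI_REFERENCE and split_reference are hypothetical, chosen only for this illustration. The regex splits components without validating them, so it shows only where the bracketed text lands, not whether the result is conforming.

```python
import re

# Component-splitting regex from RFC 3986, Appendix B.
# It does not validate; it only separates the five components.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_reference(ref):
    """Split a URI reference into its five RFC 3986 components.

    Hypothetical helper for illustration; absent components are None.
    """
    m = URI_REFERENCE.match(ref)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


if __name__ == "__main__":
    # Same bracketed text, three different readings depending on context.
    for ref in ("//[::1]/a", "/[::1]/a", "[::1]"):
        print(ref, split_reference(ref))
```

With a leading "//" the bracketed text is the authority; with a single leading "/" it falls into the path. The bare string "[::1]" is split at its first colon, leaving "[" in the scheme position and ":1]" as the path, which is one way to see why brackets outside the authority invite confusion.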
Received on Tuesday, 25 September 2012 18:20:47 UTC