- From: Noah Mendelsohn <nrm@arcanedomain.com>
- Date: Fri, 04 Nov 2011 00:52:36 -0400
- To: "www-tag@w3.org" <www-tag@w3.org>
- CC: Peter Saint-Andre <stpeter@stpeter.im>
This e-mail from Peter Saint-Andre to the public-iri mailing list may be of interest to the TAG.

Noah

-------- Original Message --------
Subject: URL parsing in HTML5
Resent-Date: Fri, 04 Nov 2011 04:22:35 +0000
Resent-From: public-iri@w3.org
Date: Thu, 03 Nov 2011 21:21:50 -0700
From: Peter Saint-Andre <stpeter@stpeter.im>
To: public-iri@w3.org <public-iri@w3.org>, public-html-comments@w3.org
CC: Sam Ruby <rubys@intertwingly.net>, "Paul Cotton (pcotton@microsoft.com)" <pcotton@microsoft.com>, Ian Hickson <ian@hixie.ch>, "Michael(tm) Smith" <mike@w3.org>, Adam Barth <ietf@adambarth.com>, Edward O'Connor <ted@oconnor.cx>

After chatting during TPAC 2011 with Addison, Larry, Richard, Ian, Mike, Ted, Julian (etc.), I'd like to share some thoughts about a possible compromise / resolution regarding Issue 56 in the HTML WG:

http://www.w3.org/html/wg/tracker/issues/56

Some observations and opinions:

1. It is unlikely that existing browsers will change their current URL parsing behavior. (I am not judging whether that behavior is good or bad.)

2. Documentation of that behavior is out of scope for the revisions to RFC 3987, and outside the charter of the IRI WG, because it's a matter of URI [pre-]processing (RFC 3986) and not IRI processing (RFC 3987).

3. It is unlikely that RFC 3986 will ever be modified to recommend the current behavior, and simply impossible before HTML5 is advanced at the W3C (even if such modifications were desirable).

4. As far as I can see, the current behavior is in fact out of scope for RFC 3986 and any future possible revisions to RFC 3986 because:

   (a) it is mostly or completely a matter of pre-processing of strings that look like URIs/URLs/"web-addresses" -- we could call these "candidate strings" or "proto-URLs" or somesuch to disambiguate them from URIs

   (b) this pre-processing behavior is applied only in the web context by browsers and software applications that want to be consistent with browsers

   (c) because of (b), there is no great danger that this behavior will "leak" into processing of URIs in general (mailto:, sip:, tel:, URNs, and so on)

5. There's no necessity for work on documentation of the current URL parsing behavior to happen at the IETF, given that it's out of scope for the IRI WG. Although this work could be done as an individual (non-WG) I-D at the IETF, I think it could more easily be done at the W3C, either as part of the HTML specification or as a separate document (the latter might be preferable so that it can be reviewed in a more focused manner and referenced more easily by other W3C specifications, but naturally I would leave such decisions up to folks at the W3C).

[The IRI WG is still responsible for rfc3987bis, but that's off-topic for this email message.]

If folks can agree on the foregoing points, then I think it would be productive to work on proposed revisions to the current text (or at least what I believe is the current text):

http://www.w3.org/TR/html5/Overview.html#parsing-urls

I would be happy to make concrete suggestions during that revision process if someone from the W3C could point to the preferred venue or process (e.g., wiki page or bugzilla comments).

I look forward to discussing this further tomorrow morning during the HTML WG session:

http://lists.w3.org/Archives/Public/public-html/2011Nov/0013.html

Peter

--
Peter Saint-Andre
https://stpeter.im/
Received on Friday, 4 November 2011 04:53:04 UTC