
Re: Fwd: URL parsing in HTML5

From: Peter Saint-Andre <stpeter@stpeter.im>
Date: Thu, 03 Nov 2011 21:57:41 -0700
Message-ID: <4EB370C5.5020504@stpeter.im>
To: Noah Mendelsohn <nrm@arcanedomain.com>
CC: "www-tag@w3.org" <www-tag@w3.org>

Thanks, Noah. This will be discussed before 11 AM tomorrow morning at
the HTML WG, so I might need to slip out of the SPDY discussion for a
while at some point.

On 11/3/11 9:52 PM, Noah Mendelsohn wrote:
> This e-mail from Peter Saint-Andre to the public-iri mailing list may be
> of interest to the TAG.
> Noah
> -------- Original Message --------
> Subject: URL parsing in HTML5
> Resent-Date: Fri, 04 Nov 2011 04:22:35 +0000
> Resent-From: public-iri@w3.org
> Date: Thu, 03 Nov 2011 21:21:50 -0700
> From: Peter Saint-Andre <stpeter@stpeter.im>
> To: public-iri@w3.org <public-iri@w3.org>, public-html-comments@w3.org
> CC: Sam Ruby <rubys@intertwingly.net>,  "Paul Cotton
> (pcotton@microsoft.com)" <pcotton@microsoft.com>, Ian Hickson
> <ian@hixie.ch>, "Michael(tm) Smith" <mike@w3.org>,  Adam Barth
> <ietf@adambarth.com>, Edward O'Connor <ted@oconnor.cx>
> After chatting during TPAC 2011 with Addison, Larry, Richard, Ian, Mike,
> Ted, Julian (etc.), I'd like to share some thoughts about a possible
> compromise / resolution regarding Issue 56 in the HTML WG:
> http://www.w3.org/html/wg/tracker/issues/56
> Some observations and opinions:
> 1. It is unlikely that existing browsers will change their current URL
> parsing behavior. (I am not judging whether that behavior is good or bad.)
> 2. Documentation of that behavior is out of scope for the revisions to
> RFC 3987, and outside the charter of the IRI WG, because it's a matter
> of URI [pre-]processing (RFC 3986) and not IRI processing (RFC 3987).
> 3. It is unlikely that RFC 3986 will ever be modified to recommend the
> current behavior, and simply impossible before HTML5 is advanced at the
> W3C (even if such modifications were desirable).
> 4. As far as I can see, the current behavior is in fact out of scope for
> RFC 3986 and any future possible revisions to RFC 3986 because:
>    (a) it is mostly or completely a matter of pre-processing of strings
>    that look like URIs/URLs/"web-addresses" -- we could call these
>    "candidate strings" or "proto-URLs" or somesuch to disambiguate them
>    from URIs
>    (b) this pre-processing behavior is applied only in the web context
>    by browsers and software applications that want to be consistent
>    with browsers
>    (c) because of (b), there is no great danger that this behavior will
>    "leak" into processing of URIs in general (mailto:, sip:, tel:,
>    URNs, and so on)
> 5. There's no necessity for work on documentation of the current URL
> parsing behavior to happen at the IETF, given that it's out of scope for
> the IRI WG. Although this work could be done as an individual (non-WG)
> I-D at the IETF, I think it could more easily be done at the W3C, either
> as part of the HTML specification or as a separate document (the latter
> might be preferable so that it can be reviewed in a more focused manner
> and referenced more easily by other W3C specifications, but naturally I
> would leave such decisions up to folks at the W3C). [The IRI WG is still
> responsible for rfc3987bis, but that's off-topic for this email message.]
> If folks can agree on the foregoing points, then I think it would be
> productive to work on proposed revisions to the current text (or at
> least what I believe is the current text):
> http://www.w3.org/TR/html5/Overview.html#parsing-urls
> I would be happy to make concrete suggestions during that revision
> process if someone from the W3C could point to the preferred venue or
> process (e.g., wiki page or bugzilla comments).
> I look forward to discussing this further tomorrow morning during the
> HTML WG session:
> http://lists.w3.org/Archives/Public/public-html/2011Nov/0013.html
> Peter
> -- 
> Peter Saint-Andre
> https://stpeter.im/

Peter Saint-Andre
Received on Friday, 4 November 2011 04:58:09 UTC
