
Re: URL work in HTML 5

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Tue, 25 Sep 2012 19:46:55 +0900
Message-ID: <50618B9F.6080101@it.aoyama.ac.jp>
To: Julian Reschke <julian.reschke@gmx.de>
CC: Robin Berjon <robin@w3.org>, Larry Masinter <masinter@adobe.com>, W3C TAG <www-tag@w3.org>

On 2012/09/25 17:49, Julian Reschke wrote:
> On 2012-09-25 10:31, Robin Berjon wrote:
>> On 25/09/2012 04:29 , Larry Masinter wrote:
>>> I think there was a group willing to consider the redefinition of
>>> URLs in HTML5 as a local anomaly within HTML, in a way that didn’t
>>> really affect any other format or application.
>>
>> My understanding is that Anne is working on an improved definition of
>> URLs because he noticed demonstrable severe interoperability issues with
>> tasks as deceptively simple as parsing URLs.
>>
>> Has anyone in this thread taken if only five minutes to perhaps peruse
>> the evidence and see if he might not have a point? I ask because I've
>> given it a cursory look and what I've seen is ugly.
>
> Of course there is a point. The specs (RFCs 3986 and 3987) do not define
> how to treat broken identifiers. Furthermore, references in HTML
> definitely do require preprocessing (such as dropping leading
> whitespace, or potentially rewriting query parts when not in UTF-8)
> before they can be handled as URIs/IRIs.
>
> This is not a new discussion.

Fully agree. Indeed, lots of attempts have been made to try and describe 
what browsers actually do with goop they find in a@href, img@src, and 
the like. If Anne can pull that off, then hats off to him. But given the 
current divergences between browsers, it may not exactly be easy.
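
To make the kind of preprocessing Julian mentions concrete (dropping surrounding whitespace, percent-encoding characters that are not allowed in a URI, and encoding the query part in the document's encoding rather than UTF-8), here is a minimal Python sketch. The function name `sanitize_href` and the `query_encoding` parameter are hypothetical, stripping trailing as well as leading whitespace is an assumption, and the sketch makes no claim to match what any particular browser does. The host part is deliberately left untouched.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# ASCII whitespace that is stripped from both ends of the attribute value
# (an assumption; the thread only mentions *leading* whitespace).
ASCII_WHITESPACE = " \t\n\f\r"

# Characters left as-is in addition to letters, digits and "-._~".
# "%" is kept so that existing percent-escapes are not double-encoded.
SAFE = "%/:@!$&'()*+,;=?"


def sanitize_href(raw: str, query_encoding: str = "utf-8") -> str:
    """Hypothetical mapping from an HTML attribute value to a URI reference."""
    parts = urlsplit(raw.strip(ASCII_WHITESPACE))
    # Path and fragment are percent-encoded as UTF-8.
    path = quote(parts.path, safe=SAFE)
    fragment = quote(parts.fragment, safe=SAFE)
    # The query is percent-encoded in the (assumed) document encoding.
    query = quote(parts.query, safe=SAFE, encoding=query_encoding)
    # Scheme and authority are passed through unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))
```

For example, `sanitize_href("  /caf\u00e9?q=\u00fc x#frag ")` would yield `/caf%C3%A9?q=%C3%BC%20x#frag`, while passing `query_encoding="iso-8859-1"` for the same query would encode `ü` as `%FC`. The point is only that such a step sits in front of RFC 3986/3987 processing; it does not replace it.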

> I believe that this can be best handled by acknowledging that what HTML
> uses are identifiers that need some level of sanitization before they
> can be treated as URI/IRI (references).
>
> It appears that Anne's approach is to pretend that the RFCs are broken
> and need to be completely replaced. This of course ignores the fact
> that they are widely implemented outside browsers.
>
> What we IMHO need is a *precise* problem statement, and then a mapping
> layer.

I don't think the problem statement is too difficult. What Anne is after 
is implementation instructions for browsers. That's a good thing to 
have. But for somebody creating a URI or IRI, or creating a URI/IRI 
scheme, browser quirks can and should be irrelevant. It would be 
hopelessly confusing for them to look at Anne's document.

Regards,   Martin.

> Also, it's not helpful that terminology from 3986 ("resolve") is used
> for something else, leading to even more confusion.
>
>> ...
>
> Best regards, Julian
Received on Tuesday, 25 September 2012 10:49:00 GMT
