Re: URL work in HTML 5

On 2012/09/25 17:49, Julian Reschke wrote:
> On 2012-09-25 10:31, Robin Berjon wrote:
>> On 25/09/2012 04:29 , Larry Masinter wrote:
>>> I think there was a group willing to consider the redefinition of
>>> URLs in HTML5 as a local anomaly within HTML, in a way that didn’t
>>> really affect any other format or application.
>> My understanding is that Anne is working on an improved definition of
>> URLs because he noticed demonstrable severe interoperability issues with
>> tasks as deceivingly simple as parsing URLs.
>> Has anyone in this thread taken if only five minutes to perhaps peruse
>> the evidence and see if he might not have a point? I ask because I've
>> given it a cursory look and what I've seen is ugly.
> Of course there is a point. The specs (RFCs 3986 and 3987) do not define
> how to treat broken identifiers. Furthermore, references in HTML
> definitively do require preprocessing (such as dropping leading
> whitespace, or potentially rewriting query parts when not in UTF-8)
> before they can be handled as URIs/IRIs.
> This is not a new discussion.

Fully agree. Indeed, lots of attempts have been made to try and describe 
what browsers actually do with goop they find in a@href, img@src, and 
the like. If Anne can pull that off, then hats off to him. But given the 
current divergences between browsers, it may not exactly be easy.
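
To make the kind of preprocessing Julian mentions a bit more concrete, here is a rough Python sketch of such a step: drop surrounding whitespace, percent-encode the query using the document's encoding, and only then hand the result to an RFC 3986 resolver. The names sanitize_reference and to_absolute are made up for illustration, and the rules shown are assumptions rather than a description of what any browser actually does.

```python
from urllib.parse import quote, urljoin, urlsplit, urlunsplit

# Characters left untouched in the query; "%" is kept so that existing
# percent-escapes are not encoded a second time.
QUERY_SAFE = "=&%+/?:@!$'()*,;"


def sanitize_reference(raw, doc_encoding="utf-8"):
    """Hypothetical cleanup of an href/src value (illustrative only)."""
    # Step 1: drop leading and trailing whitespace.
    cleaned = raw.strip(" \t\n\r\f")
    parts = urlsplit(cleaned)
    # Step 2: percent-encode the query using the document's encoding.
    # errors="replace" is a simplification for unencodable characters.
    query = quote(parts.query, safe=QUERY_SAFE,
                  encoding=doc_encoding, errors="replace")
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       query, parts.fragment))


def to_absolute(base, raw, doc_encoding="utf-8"):
    """Sanitize first, then resolve against the base as a plain reference."""
    return urljoin(base, sanitize_reference(raw, doc_encoding))
```

For example, to_absolute("http://example.org/a/b", "  /search?q=é ", "iso-8859-1") would give http://example.org/search?q=%E9, whereas the same input in a UTF-8 document would give ...?q=%C3%A9. That difference is exactly the sort of thing the generic specs cannot know about.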

> I believe that this can be best handled by acknowledging that what HTML
> uses are identifiers that need some level of sanitization before they
> can be treated as URI/IRI (references).
> It appears that Anne's approach is to pretend that the RFCs are broken
> and need to be completely replaced. This of course ignores the fact
> that they are widely implemented outside browsers.
> What we IMHO need is a *precise* problem statement, and then a mapping
> layer.

I don't think the problem statement is too difficult. What Anne is after 
is implementation instructions for browsers. That's a good thing to 
have. But for somebody creating a URI or IRI, or creating a URI/IRI 
scheme, browser quirks can and should be irrelevant. It would be 
hopelessly confusing for them to look at Anne's document.

Regards,   Martin.

> Also, it's not helpful that terminology from 3986 ("resolve") is used
> for something else, leading to even more confusion.
>> ...
> Best regards, Julian

Received on Tuesday, 25 September 2012 10:49:00 UTC