Re: Shrinking HTML5 some more — Anne’s Weblog

On Mar 30, 2009, at 5:30 AM, Sam Ruby wrote:

> Anne van Kesteren wrote:
>> On Mon, 30 Mar 2009 12:25:24 +0200, Julian Reschke <julian.reschke@gmx.de> wrote:
>>> Anne van Kesteren wrote:
>>>> On Mon, 30 Mar 2009 12:05:07 +0200, Julian Reschke <julian.reschke@gmx.de> wrote:
>>>>> Anne van Kesteren wrote:
>>>>>> LEIRIs are not a solution.
>>>>>
>>>>> Please elaborate.
>>>> The reasons why LEIRIs are not a solution are:
>>>>   1) URL character encoding flag (affects the query component)
>>>
>>> That's a special case for HTML, and I think it should be handled  
>>> over there.
>> It affects some APIs as well (e.g. window.location). This is one of
>> the main reasons HTML5 has a separate URL section. If we do not
>> want such a section we need to address it elsewhere.
>>>>  3) Potential other differences between LEIRIs and URLs
>>>
>>> And these we obviously would need to check. So which are these?
>>> (Given the fact that the specification of LEIRIs is
>>> work-in-progress)
>> I don't know.
>
> So other than the name LEIRI and definition of same, and where this  
> should be documented, we are all in agreement? :-)
>
> I wasn't aware that Martin was actively revising the IRI  
> specification.  And I sense that others at the IETF/HTML5 meeting  
> weren't either.
>
> Is someone willing to volunteer to work with M. Duerst and M.
> Suignard to see to it that the revision is something that is usable
> by the HTML 5 Working Draft?
>
> Alternatives: DanC continues to pursue a separate draft, or Ian
> continues to include this section in HTML 5.

Pardon if this is stating the obvious, but to make an informed  
decision we need to know the following:

1) What specifically are the processing differences between LEIRIs and
Web Addresses (or HTML5 pre-processed URIs if you prefer)? It seems
like some of these are known, but not all.

2) Are the current IRI/LEIRI editors amenable to making changes for  
real-world compatibility?
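
To make one of the known differences concrete (the URL character encoding flag mentioned in the quoted text, which affects the query component), here is a minimal Python sketch. It only illustrates the general idea that the same query can percent-encode differently depending on the character encoding in use; the helper name encode_query and the sample query are hypothetical, and this is not a description of what any spec or browser actually does.

```python
from urllib.parse import quote


def encode_query(query: str, encoding: str = "utf-8") -> str:
    """Percent-encode a query string using the given character encoding.

    Hypothetical helper for illustration only; not taken from any spec.
    """
    # Leave the usual query delimiters alone. Other characters are first
    # converted to bytes with `encoding` and then percent-escaped.
    return quote(query, safe="=&+", encoding=encoding)


# UTF-8 is the encoding an IRI-style conversion would use.
print(encode_query("q=café", "utf-8"))         # q=caf%C3%A9
# A legacy document encoding yields different bytes for the same query.
print(encode_query("q=café", "windows-1252"))  # q=caf%E9
```

The two results differ for the same input, so a server would receive different bytes depending on which rule the user agent applies.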

From my point of view, it is essential to have a specification that
correctly describes what user agents must do to process URLs in public
Web content. It would be even better if that were the primary
official IETF spec for *R*s, but having at least one correct
specification is more important than consolidation. So we should
figure out if a unified spec would be able to match real-world
constraints before we sign on for it.

Having worked on Safari since its inception, I very clearly recall the  
amount of crazy reverse engineering we had to do to figure out URL  
processing, after initially naively assuming that the URI RFC  
specified what we had to do. It took multiple releases to get closer  
to the actual de facto standard, and I'm not even 100% sure we are  
there today. This is a significant barrier to entry for any
tool that wants to process Web content, and we should definitely work
to get it fixed.

Regards,
Maciej

Received on Monday, 30 March 2009 20:03:58 UTC