Re: URL parsing in HTML5

On Fri, 04 Nov 2011 09:25:15 -0700, Julian Reschke <julian.reschke@gmx.de>  
wrote:
> On 2011-11-04 16:58, Anne van Kesteren wrote:
>> details of URL processing are beside the point here. The point is that
>> URL processing should be uniform. What the exact details of URL
>> processing should be is indeed not completely figured out just yet, but
>> it is clear that the IETF specifications on the matter are fiction.
>
> Well, so is what the HTML spec says. The problem is to pretend that it's  
> possible to agree on the same error handling for everybody.

We have crossed that bridge for much more complex problems, such as HTML  
parsing, so I think it should be doable.


> We spent tons of emails on the IRI mailing list to figure out *which*  
> "willful violations" of RFC 3986 UA implementers agree on, and didn't  
> really find a lot.

That does not mean we do not want to converge.


>> You keep bringing this example up and I will remind you once again that
>> obviously you would have to split on whitespace characters first in such
>> cases. This does not affect uniform URL processing in the slightest,
>> it just means we should either require whitespace characters in URLs to
>> always be escaped, or require whitespace characters in URLs to be
>> escaped in cases where URLs are whitespace separated.
>
> It means that you have at least *two* processing algorithms, no matter  
> how you rephrase it .-)

Yes, you need two algorithms because standalone URLs and whitespace  
separated URLs are distinct. What are you trying to say?
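
A minimal sketch of that distinction, in Python. The function names and the example base URL are hypothetical, and urljoin merely stands in for whatever the shared URL processing turns out to be. The only difference between the two entry points is the splitting step; each token then goes through the same single-URL processing:

```python
import re
from urllib.parse import urljoin

# HTML's space characters (assumed here as the separator set).
SPACE_CHARS = " \t\n\f\r"


def resolve_url(raw, base):
    """Standalone context: trim surrounding whitespace, then resolve.

    Interior whitespace is left in place; escaping policy is out of
    scope for this sketch.
    """
    return urljoin(base, raw.strip(SPACE_CHARS))


def resolve_url_list(raw, base):
    """List context: split on whitespace first, then reuse resolve_url."""
    tokens = re.split("[" + SPACE_CHARS + "]+", raw.strip(SPACE_CHARS))
    return [resolve_url(token, base) for token in tokens if token]
```

With the base "http://example.org/dir/", resolve_url treats "a b.html" as one URL containing a space, while resolve_url_list treats it as two URLs. That is why whitespace would have to be escaped wherever URLs are whitespace separated.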


-- 
Anne van Kesteren
http://annevankesteren.nl/

Received on Friday, 4 November 2011 16:36:19 UTC