Re: Shrinking HTML5 some more — Anne’s Weblog

On Mar 30, 2009, at 1:03 PM, Maciej Stachowiak wrote:
> From my point of view, it is essential to have a specification that  
> correctly describes what user agents must do to process URLs in  
> public Web content. It would be even better if that were the  
> primary official IETF spec for *R*s, but having at least one  
> correct specification is more important than consolidation. So we  
> should figure out if a unified spec would be able to match  
> real-world constraints before we sign on for it.

There are no known or reported errors in RFC 3986.  There are a few
errors in implementations, but those errors are not consistent enough
to justify any changes to STD 66.  If you think there are, then kindly
provide a list of such errors.

> Having worked on Safari since its inception, I very clearly recall  
> the amount of crazy reverse engineering we had to do to figure out  
> URL processing, after initially naively assuming that the URI RFC  
> specified what we had to do. It took multiple releases to get  
> closer to the actual de facto standard, and I'm not even 100% sure  
> we are there today. This is definitely a significant barrier to  
> entry for any tool that wants to process Web content and we should  
> definitely work to get it fixed.

Are you talking about how to handle character encoding within
attribute values?  That is an HTML parsing issue.  The URI spec
does not define that because it is orthogonal to URI usage across
all of the different formats and implementations.  The URI spec
doesn't even come into scope until after the string is converted
from its data-entry format into a URI reference.
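
To illustrate the layering described above, here is a minimal Python
sketch, not a definitive implementation. It assumes the data-entry
format is an HTML attribute value whose character references must be
resolved first. The regular expression is the one given in RFC 3986,
Appendix B; the function names and the example URL are hypothetical.

```python
import html
import re

# Regular expression from RFC 3986, Appendix B, which splits a
# URI reference into its five components.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def from_html_attribute(raw_value):
    """Host-format step (hypothetical helper): resolve HTML character
    references. This part belongs to HTML parsing, not to the URI spec."""
    return html.unescape(raw_value)


def split_uri_reference(reference):
    """URI step (hypothetical helper): split an already-extracted
    URI reference into the components named in RFC 3986."""
    match = URI_REFERENCE.match(reference)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


# Attribute value as it might appear in HTML source (assumed example).
raw = "http://example.com/search?x=1&amp;y=2#top"
reference = from_html_attribute(raw)      # HTML's responsibility
parts = split_uri_reference(reference)    # RFC 3986's responsibility
print(reference)
print(parts)
```

The two functions are kept separate on purpose: a different host
format (a plain-text file, an HTTP header) would replace only the
first step and reuse the second unchanged.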

....Roy

Received on Tuesday, 31 March 2009 02:19:39 UTC