[whatwg] Proposal for improved handling of '#' inside of data URIs

On Sat, 10 Sep 2011 21:15, Daniel Holbert wrote:
> * Opera is interesting -- it can exhibit either the Firefox or WebKit
> behaviors in tests A/B/C, depending on whether you view the data URI as
> an embedded element (via iframe/img) or view it directly. When you view it
> as an embedded element (in my testcase), Opera matches WebKit on A/B/C
> (including the XML parse error on C). However, if you *directly view*
> the data URIs (right-click on iframe, Frame|Open, focus URLbar & hit
> enter), then Opera matches Firefox. Also, Opera passes test D.
>
So Opera treats the src attribute as a URI, but the href attribute and
identifiers entered by users as URI references? This does not conform to
the WHATWG HTML5 standard, which delegates the definition of URI to
RFC 3986, which in turn defines the "#" character as the start of a
fragment identifier. Nor does it quite conform to RFC 2396, to which
HTML 4.01 delegates: that RFC forbids "#" in URIs but uses it as the
separator between the URI and the fragment identifier in a URI
reference (without specifying how to parse URIs that are not part of
URI references). I believe HTML 4.01's use of the term URI instead
of URI reference is an error, but the HTML working group (or the
editor(s) on its behalf) would have to confirm that.
According to my interpretation of RFC 2396, the "#" should terminate the
URI, since a URI can't contain "#", but this isn't stated explicitly, so
I can't tell whether Opera violates the RFC, and thus whether its
behavior is "correct". It clearly violates RFC 3986, however.
The correct thing to do seems to be to violate HTML 4.01 & RFC 2396
but conform to HTML5 & RFC 3986. Adding a special case for one URI
scheme seems a little odd, but I can't think of a use case for fragment
identifiers in data URIs.
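
As a rough illustration of the RFC 3986 reading, the sketch below splits a
URI reference with the regular expression given in Appendix B of that RFC.
The function names (split_uri_reference, escape_hash) and the sample data
URI are made up for this example, and it is not meant to describe how any
browser actually parses URIs.

```python
import re

# Regular expression from RFC 3986, Appendix B, for splitting a URI reference.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri_reference(ref):
    """Return (scheme, authority, path, query, fragment); absent parts are None."""
    m = URI_REFERENCE.match(ref)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def escape_hash(payload):
    """Percent-encode "#" so it stays in the data instead of starting a fragment.

    Hypothetical helper for this example only.
    """
    return payload.replace("#", "%23")


# An unescaped "#" ends the data and starts a fragment identifier.
raw = "data:text/html,<p>a#b</p>"
print(split_uri_reference(raw))
# ('data', None, 'text/html,<p>a', None, 'b</p>')

# With "#" written as %23, the whole payload stays in the path component.
safe = "data:text/html," + escape_hash("<p>a#b</p>")
print(split_uri_reference(safe))
# ('data', None, 'text/html,<p>a%23b</p>', None, None)
```

Under this reading, an author who wants a literal "#" in the payload of a
data URI has to write it as %23.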

Received on Saturday, 10 September 2011 17:30:39 UTC