- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Mon, 22 Mar 2010 20:16:02 -0400
- To: "Tab Atkins Jr." <jackalmage@gmail.com>
- CC: HTMLwg WG <public-html@w3.org>
On 3/22/10 6:59 PM, Tab Atkins Jr. wrote:
> On Mon, Mar 22, 2010 at 3:39 PM, Maciej Stachowiak <mjs@apple.com> wrote:
>> Another possibility is to change parsing such that "...&copy=..." is not a
>> hazard. I believe you had some evidence that this would fix more sites than
>> it would break. It seems like it would also have the benefit of allowing us
>> to make authoring rules more lenient in a beneficial way, without at the
>> same time introducing undue complexity.
>
> If at all possible, this is what I'd prefer. I've never consciously
> escaped an ampersand in a URL in my life (and luckily don't think that
> I've ever run into a situation where it got interpreted as a named
> entity). I'd prefer, if possible, to continue avoiding escaping the
> ampersand. Unicode exists for a reason, after all. If I want a
> copyright symbol, I can just pop that character itself into the URL.

So as I see it, the options are:

1) Disallow entity references in all attributes.
2) Accept the fact that moving text from attribute A to attribute B via
   DOM manipulation may well not give the same results as having the
   text in attribute B to start with.
3) Allow entity references in all attributes, and require &amp; in URIs
   as needed.

#3 is what we have right now, right?

Is there any indication that #1 is safe to do (as in, doesn't break the
web)? Is the weirdness of #2 outweighed by not needing to escape '&' in
URIs?

My gut feel is that #1 is not actually a viable option and that one's
take on #2 depends heavily on the authoring workflow and the extent to
which pages are scripted...

-Boris
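
For illustration only, here is a rough sketch of the hazard behind option 3, written in Python. The URL is made up, and `html.unescape` applies the HTML5 rules for character references in ordinary text, so it is only an approximation of how any given browser decodes attribute values.

```python
import html

# Hypothetical URL with a query parameter that happens to be named "copy".
raw_url = "page?a=1&copy=2"

# Written into markup as-is: "&copy" is decoded as a named character
# reference even without a trailing semicolon.
parsed_unescaped = html.unescape(raw_url)

# Written with the ampersand escaped, as option 3 requires.
escaped_markup = html.escape(raw_url)
parsed_escaped = html.unescape(escaped_markup)

print(parsed_unescaped)  # page?a=1©=2
print(escaped_markup)    # page?a=1&amp;copy=2
print(parsed_escaped)    # page?a=1&copy=2
```

Only the escaped form round-trips to the intended URL under these rules, which is the authoring cost being weighed against the other two options.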
Received on Tuesday, 23 March 2010 00:16:38 UTC