- From: Ian Hickson <ian@hixie.ch>
- Date: Mon, 30 Jun 2008 06:06:06 +0000 (UTC)
- To: Julian Reschke <julian.reschke@gmx.de>
- Cc: uri@w3.org, HTML WG <public-html@w3.org>
On Mon, 30 Jun 2008, Julian Reschke wrote:
>
> With question marks, there will be data loss. You may or may not notice
> it, because the page you get may look ok (for instance, it depends on
> how important that part of the query was). If you notice that something
> is wrong, then, yes, spotting the question mark may help. If you
> understand the issue itself. For how many users is that the case?
>
> With UTF-8/percent-escaping, the page may very well work as desired,
> because the server happens to understand that encoding

There is no question that always using UTF-8 would be better than the
current mess.

> (see Google case cited in Webkit bug report).

Do you mean the case that gets converted to &#...;? That's not UTF-8. (If
you mean something else, could you provide a link?)

> Finally, if you copy & paste the URL, you wouldn't see the replacement
> characters anyway, right? In which case the default handling (using
> UTF-8) would apply; which even more is a reason to consider making this
> mandatory (because otherwise following the link inside the document and
> the copy/paste case yield different results).

Having the encoding be essentially random is far worse than converting the
character to a question mark, IMHO.

Anyway, the whole issue is easily avoided by authors by just using UTF-8.
This entire problem can only be reached in invalid documents anyway.

> > > I care because I'd like to see documents using non-ASCII characters
> > > in query parts become compliant no matter what encoding they are in.
> >
> > Unless we change the definition of HTML5's URLs to be conforming even
> > when those URLs would not be treated as IRIs, I don't see any way to
> > get there from here.
>
> We could break the affected pages and/or add a mechanism through which
> pages can opt-in into the sane UTF-8 based behavior.

Breaking the pages isn't an option, and an opt-in is already available:
use UTF-8.
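As a rough illustration of the trade-off discussed above, the following Python sketch percent-encodes a query value using a page's character encoding. It is only a model of the behaviour being debated, not any browser's actual algorithm. The function name `encode_query_value` and its parameters are hypothetical, and the choice to leave `?` unescaped is an assumption made so that the replacement character stays visible.

```python
from urllib.parse import quote


def encode_query_value(value: str, page_encoding: str, errors: str = "replace") -> str:
    """Percent-encode a query value after converting it to the page's encoding.

    Hypothetical helper for illustration only. With errors="replace",
    characters the encoding cannot represent become a literal "?" (data loss).
    With errors="xmlcharrefreplace" they become a numeric character
    reference such as "&#9786;", which is then percent-encoded.
    """
    raw = value.encode(page_encoding, errors=errors)
    # Keep "?" unescaped (an assumption) so the replacement is easy to spot.
    return quote(raw, safe="?")


# U+00E9 exists in both encodings, but the resulting bytes differ.
latin1_e = encode_query_value("\u00e9", "latin-1")    # "%E9"
utf8_e = encode_query_value("\u00e9", "utf-8")        # "%C3%A9"

# U+263A cannot be represented in Latin-1, so information is lost.
latin1_smile = encode_query_value("\u263a", "latin-1")  # "?"
utf8_smile = encode_query_value("\u263a", "utf-8")      # "%E2%98%BA"

# The "&#...;" style fallback is neither UTF-8 nor a question mark.
ncr_smile = encode_query_value("\u263a", "latin-1", errors="xmlcharrefreplace")
```

The same link text yields different bytes depending on the page encoding, which is why a server cannot reliably guess what was meant unless the page is UTF-8 to begin with.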
This issue is not even remotely important enough on the grand scale of
things to deserve special syntax or options or whatnot.

> > The HTMLWG is only a small part of the broad range of places from
> > which I take input, which includes hundreds of blogs, at least three
> > separate bug systems, multiple other mailing lists, face to face
> > discussions, IRC conversations on dozens of channels and privately,
> > private e-mails, etc. I try to keep as much of the discussions to the
> > HTMLWG and WHATWG lists, but the sheer volume of traffic that would be
> > generated by archiving all the sources of input on public-html would
> > be staggering, and that's without even considering whether all those
> > people would actually be willing to have their input forwarded in that
> > way.
>
> In which case it seems to me we have a big process problem.

My goal is to get a good specification and bring the Web forward, not to
follow process, so that's quite possible, yes. I'm certainly not going to
start putting process ahead of getting quality feedback.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Monday, 30 June 2008 06:06:47 UTC