- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Wed, 25 Jun 2008 20:19:52 +0200
- To: Ian Hickson <ian@hixie.ch>
- Cc: URI <uri@w3.org>
On Jun 25, 2008, at 7:33 AM, Ian Hickson wrote:

> Standards, for the purposes of the HTML5 effort, are comprehensive
> documentation intended to make it possible to implement user agents, and
> are thus very much not abstractions.

That is obviously the definition of an implementation specification, not a standard.

> This isn't intended to disparage other beliefs or opinions as to what
> standards should be. I have no problem with standards that, e.g., leave
> error handling undefined -- they are just not really relevant to the HTML5
> work.

At this rate, the feeling will be mutual. Why don't you just contribute that documentation to the Mozilla website and be done?

> You seem to be conflating the authoring requirements and the user agent
> requirements. The authoring requirements for HTML5 are just "it must be a
> valid URI or IRI". That however has little bearing on what the user agent
> conformance requirements are. The UA requirements have to handle all
> manner of things that _aren't_ valid URIs or IRIs, since in practice such
> invalid content is prevalent.

To answer your original questions, you don't need to know the scheme of the base URI in order to parse a URI reference. You do need to know it to convert a relative reference to an absolute reference, but only to the extent that you need to know the string in order to copy it.

There may be a few implementations that do it differently than what has been defined in STD 66. I don't care. STD 66 will never be changed to suit those implementations because there are a hundred that do it right for every one that is wrong (and those numbers improve every week as old code disappears).

How an HTML form constructs a query string is entirely defined by HTML. The only thing defined by URI in that case is what characters are allowed in the identifier set, and that's because of what is required when the URI is sent outside of the HTML-construction context.

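
A minimal Python sketch of the parse-versus-resolve distinction made above. The regular expression is the one given in RFC 3986 (STD 66), Appendix B. The function name split_reference and the example.com URLs are made up for illustration, and Python's urljoin merely stands in for the STD 66 resolution algorithm. Splitting the reference never consults a base URI. Only resolution does, and there the base scheme is simply copied into the result.

```python
import re
from urllib.parse import urljoin

# Component-splitting regular expression from RFC 3986 (STD 66), Appendix B.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def split_reference(ref):
    """Split a URI reference into its five components.

    Hypothetical helper for illustration: no base URI (and hence no base
    scheme) is needed at this stage.
    """
    m = URI_REFERENCE.match(ref)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

# Parsing: the relative reference is split on its own.
parts = split_reference("../b/c?x=1#frag")

# Resolution: only now is the base URI needed; its scheme is copied over.
resolved = urljoin("http://example.com/a/d/e", "../b/c?x=1#frag")
```

Here parts has no scheme or authority, only a path, query, and fragment, while resolved begins with the scheme taken from the base.
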
HTML is only one of many hundreds of data formats that use URI. HTML cannot change the definition of URIs. The contents of href="whatever" are not a URI -- they are characters that are processed as per SGML CDATA (IIRC) to transform it into a sequence of characters in the document character set, which are then considered by the HTML processor as data for the href attribute (whatever that means, it is defined by HTML, not by URI).

If HTML says that the valid data is limited to a URI in the document character set (which is presumably mapped to ASCII when sent outside the DOM), then the data either conforms to STD 66 or it is invalid. What the browser does when it sees invalid data is entirely defined by the browser and (sometimes) its configuration. It has no relevance whatsoever to the URI specification because it is not and never was a URI. The URI spec defines identifiers, not href attributes.

The only result that matters is that the invalid data is not used by sending it out of the DOM, such as by sending it as an invalid HTTP request. There is no chance that HTML5 will ever exist as a finished document if it requires the sending of invalid HTTP requests as part of its HTML implementation specification.

No, it doesn't matter how many different implementations handle invalid data in different ways. You can repeat those imaginary goals of HTML5 til the end of days and it still won't matter. The right way to handle invalid data is to refuse to use it, where "use" is entirely dependent on the context where it occurs.

I don't care what MSIE does with invalid URI references. I do care what Firefox, Safari, and WebKit do with invalid URI references, but only because I prefer to have them highlighted/rejected rather than used. The implementations I create refuse or reject invalid data because to do anything else is going to be a security hole to someone, somewhere, and it is simply irresponsible to repeat whatever mistakes were made when hacking Mosaic in 1993.

....Roy

Received on Wednesday, 25 June 2008 18:20:27 UTC