- From: Maciej Stachowiak <mjs@apple.com>
- Date: Sat, 26 Sep 2009 18:30:33 -0700
- To: "Roy T. Fielding" <fielding@gbiv.com>
- Cc: Larry Masinter <masinter@adobe.com>, "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>
On Sep 26, 2009, at 6:06 PM, Roy T. Fielding wrote:

> On Sep 26, 2009, at 5:13 PM, Maciej Stachowiak wrote:
>
>> The definition for how to perform forgiving processing of resource
>> identifiers originally started out in the HTML5 spec, where you
>> suggest it should go. However, it was moved to a separate document
>> based on strong objections from many parties. I understand from the
>> below that your objection was solely to the use of the term "URL",
>> and not to these processing rules being in the HTML spec. But that
>> was not the sole objection. Many thought it was architecturally
>> wrong to define these rules in the HTML spec. Thus, while I'm sure
>> Ian Hickson would be perfectly happy to put the processing
>> requirements back in HTML5, I'm not sure that is an acceptable
>> long-term solution.
>
> I think it is hopeless to trace back all the screwed-up
> misunderstandings of Web architecture that led to anyURI, LEIRI,
> and now HTML5-URL. I think I explained how it is supposed to work,
> succinctly and to the point where actual text can be applied to the
> HTML5 draft that will resolve all objections and settle this matter
> once and for all. If not, then we can deal with those new objections
> when they arise.

I think removing the use of the term "URL" from HTML5 would remove
some objections, but I don't think folding the text of Web Address
into HTML5 would address any objections, except perhaps the concern
about lack of timely progress in this area.

>
>> Furthermore, besides the general architectural objection, there may
>> be applications and technologies that wish to use HTML-style loose
>> processing rules. Having those rules in the HTML spec instead of in
>> a standalone specification makes it more difficult to reuse the
>> technology.
>
> Those rules already exist in RFC3986, Appendix B. What does not
> exist there is the behavior after parsing into the components,
> since that behavior is entirely application-dependent. If HTML5
> wants to define that behavior, it can do so only if the requirements
> are stated to be specific to browser-like applications.

As far as I can tell, RFC3986 *only* defines how to extract
components. It does not define how to turn an arbitrary string into a
URI, which is potentially needed for HTTPbis. It does not define how
to perform a relative resolution on a possibly-invalid reference
against a possibly-invalid base.

That being said, I think what RFC3986 Appendix B says is a good
definition of how to extract components from possibly-invalid strings.
It seems way easier to understand than what the Web Address draft
says, and better matches what implementations actually do.

>
>> On a more philosophical level: a lot more resource identifiers are
>> extracted from attributes in HTML documents than from the sides of
>> busses. It is not clear to me why the side-of-bus use case should
>> be privileged. IRIs are a standard for the Internet, not for
>> vehicular advertising. And indeed, many print ads these days drop
>> the initial http: from the addresses they print.
>
> Also explained in 3986. I don't remember if that was copied into
> 3987.

Either way, the upshot is that strings may appear in bus ads that are
not allowed to appear in format or protocol elements that require an
IRI.

>
>> For an Internet standard, there is nothing wrong with defining
>> rules for lenient processing as well as the syntax of strictly
>> conforming input. Doing so can convert "experiment[s] in
>> forgiveness" into interoperability.
>
> There is nothing wrong with defining correct processing rules for
> whatever thing you are trying to process, whether those rules be
> strict or lenient. The problem is saying that the rules are for
> processing X when in fact you are actually processing Y and then
> unilaterally declaring that Y is the new definition of X.

I don't think Larry proposed to do that. He just suggested that a
certain form of reusable lenient processing rules should be in the
same spec as the normative definition of an IRI. I don't think he
suggested that these rules should redefine what an IRI is.

Regards,
Maciej
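
For reference, the component extraction discussed above can be sketched in a few lines of Python. The regular expression is the one given in RFC3986, Appendix B. The function name `split_reference` and the sample strings are assumptions made for illustration; they do not come from any spec or from this thread. The sketch only splits a string into components. It does not validate, normalize, or resolve anything, which is the gap noted above.

```python
import re

# Regular expression from RFC 3986, Appendix B. Every part is optional,
# so it matches any input string, valid or not.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_reference(text):
    """Split a possibly-invalid reference into its five components.

    Hypothetical helper for illustration. A component that is absent
    is reported as None; the path is always a string (possibly empty).
    """
    m = URI_REFERENCE.match(text)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


if __name__ == "__main__":
    samples = [
        "http://www.ics.uci.edu/pub/ietf/uri/#Related",  # example from the RFC
        "http://example.com/a b?q=<x>#frag",             # invalid characters
        "www.example.com/index.html",                    # print-ad style, no scheme
    ]
    for s in samples:
        print(s, "->", split_reference(s))
```

Note that the print-ad style string comes back with no scheme and no authority: the whole string is treated as a path, so what happens next is left to the application.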
Received on Sunday, 27 September 2009 01:31:14 UTC