Re: Proposed Charter and Agenda for IRI BOF at IETF 76

The definition of how to perform forgiving processing of resource
identifiers started out in the HTML5 spec, where you
suggest it should go. However, it was moved to a separate document
because of strong objections from many parties. I understand from your
message below that your objection was solely to the use of the term "URL", and
not to these processing rules being in the HTML spec. But that was not
the sole objection. Many thought it was architecturally wrong to  
define these rules in the HTML spec. Thus, while I'm sure Ian Hickson  
would be perfectly happy to put the processing requirements back in  
HTML5, I'm not sure that is an acceptable long-term solution.

Furthermore, besides the general architectural objection, there may be
applications and technologies other than HTML that wish to use
HTML-style loose processing rules. Having those rules in the HTML spec
instead of in a standalone specification makes them more difficult to
reuse.

On a more philosophical level: far more resource identifiers are
extracted from attributes in HTML documents than from the sides of
buses. It is not clear to me why the side-of-bus use case should be
privileged. IRIs are a standard for the Internet, not for vehicular
advertising. And indeed, many print ads these days drop the initial
http: from the addresses they print.

For an Internet standard, there is nothing wrong with defining rules  
for lenient processing as well as the syntax of strictly conforming  
input. Doing so can convert "experiment[s] in forgiveness" into  
interoperability.
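
As an illustration only, the difference between strict syntax and
lenient processing can be sketched in a few lines of Python. The
function name lenient_to_uri and the particular clean-up steps
(trimming surrounding whitespace, percent-encoding disallowed
characters as UTF-8) are assumptions made for this sketch; they are
not the rules from HTML5 or from any IRI draft, and the sketch ignores
internationalized host names entirely.

```python
from urllib.parse import quote

# Characters left unescaped: the RFC 3986 reserved and unreserved
# punctuation, plus "%" so that existing escapes are not re-encoded.
# (Assumption for this sketch; a real specification would be per-component.)
SAFE = "%:/?#[]@!$&'()*+,;=-._~"


def lenient_to_uri(reference: str) -> str:
    """Turn loosely written input into a string of URI-legal characters.

    Hypothetical helper: trims surrounding whitespace, then
    percent-encodes everything outside SAFE and ASCII alphanumerics,
    using UTF-8 for non-ASCII characters.
    """
    trimmed = reference.strip()
    return quote(trimmed, safe=SAFE)


# Input that is already strictly conforming passes through unchanged.
print(lenient_to_uri("http://example.com/a%20b"))
# Input that a strict parser would reject is repaired the same way every time.
print(lenient_to_uri("  http://example.com/a b?q=\u00e9 "))
```

The point is only that once such steps are written down, every
implementation repairs the same non-conforming input in the same way.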

Regards,
Maciej

On Sep 25, 2009, at 12:19 PM, Roy T. Fielding wrote:

> Larry, your changes to the IRI draft make it incomprehensible.
>
> I think this is getting ridiculous.  We don't need a working group.
> We don't even need an updated draft, at least not for LEIRI, Href,
> and whatever it is that we call HTML5 references.
>
> HTML5 wants to specify the *process* of taking arbitrary data entry
> in various places and transforming it into a) something the browser
> displays, and b) a URI for use on the wire.  What they are calling URL
> is the arbitrary data entry part, NOT the resulting URI, which is why
> it is so frigging annoying and inconsistent with all other standards.
> LEIRI made the same mistake.
>
> The purpose of IRI is to specify the allowed syntax for what one
> might see on the side of a bus as a Web address in i18n-friendly,
> human-readable form.  That is why the IRI syntax does not allow
> common delimiters like whitespace, quotes, and brackets (except
> for IPv6 literals).  It does not define a data-entry box.
>
> URI is in the same boat, except that it also defines the allowed
> syntax for on-the-wire usage in HTTP, etc.  It is intentionally
> limited for use in embedded plain text.  It does not define a
> data entry box.
>
> Both IRI and URI are intended to define standards for the Internet
> in the same way as the US Postal Service residential addresses have
> a standard normal form.  The fact that an envelope does not prevent
> a person from writing an arbitrary form of address in the hope that
> a mail carrier can interpret it for them is not an indication that
> the standard is somehow "wrong" -- what matters is that following
> the standard is known to be interoperable, and everything else is
> just an experiment in forgiveness.
>
> What HTML5 wants to define is how to process a data entry box
> in the same way across all browser implementations, and there is
> nothing wrong with such a definition appearing in HTML5 *except*
> for the fact that the editor has chosen an existing well-known
> term that means something else to describe it, which conflicts
> with all prior uses of that term.  Just stop that nonsense by
> changing the HTML5 draft wording to talk about references, not URLs.
> HTML5 does not require changes to IRI, and certainly not to URI.
>
> Changing IRI (or URI) so that it conforms both to the side of a
> bus definition and a data entry definition is insane.  They are
> not the same thing.  They do not share the same concerns.  A
> reference might allow anything, depending on its context and the
> technology used to parse it; it is the post-processing that
> produces an IRI/URI.
>
> ....Roy
>

Received on Sunday, 27 September 2009 00:13:42 UTC