Re: Proposed Charter and Agenda for IRI BOF at IETF 76 from Maciej Stachowiak on 2009-09-27 (public-iri@w3.org from September 2009)

From: Maciej Stachowiak <mjs@apple.com>
Date: Sat, 26 Sep 2009 17:19:41 -0700
To: Mark Nottingham <mnot@mnot.net>
Cc: "Roy T. Fielding" <fielding@gbiv.com>, Larry Masinter <masinter@adobe.com>, "PUBLIC-IRI@W3.ORG" <PUBLIC-IRI@w3.org>
Message-id: <57796567-D80F-4C8E-A84A-B9744FB33C8C@apple.com>

On Sep 26, 2009, at 5:01 PM, Mark Nottingham wrote:

>
> On 27/09/2009, at 9:37 AM, Maciej Stachowiak wrote:
>
>>
>> On Sep 26, 2009, at 3:04 PM, Mark Nottingham wrote:
>>
>>> I agree with Roy, and would add that this document hides the new  
>>> information (i.e., how to get from random bits to a valid URI or  
>>> IRI) too deeply; for example, if HTTPbis wanted to reference this  
>>> thing, it would need to do so by specifying a section in the IRI  
>>> spec, even though HTTP doesn't use IRIs at all.
>>>
>>> What I'd like to see is:
>>> a. A revision of the IRI spec (if necessary), and
>>> b. A new spec defining how to get from random bits to a URI or an  
>>> IRI (allowing the application to choose which one it needs to end  
>>> up with).
>>>
>>> Then, different specs can refer to URIs if they want to, IRIs if  
>>> they want to, and optionally specify this processing as a step  
>>> beforehand, and do so clearly.
>>
>> It seems like the only difference between your proposal and Larry's  
>> is whether strict processing of IRIs and lenient processing of  
>> strings that may or may not be valid IRIs are in the same spec or  
>> two separate specs.
>>
>> I don't see a great advantage in splitting the specs, as this makes  
>> cross-references more complicated.
>
> I don't have a lie-down-in-the-road issue with structuring these as  
> one document, although I do think it's more natural to separate  
> them. What I want to avoid is having this extra step hidden away in  
> a non-obvious place that's difficult to reference and specify  
> externally; as it currently sits, the processing is specified in an  
> informally named section of the IRI spec, which is the last place  
> I'd look for it if I were working with URIs.
>
> So, at a minimum, the section needs to be re-cast as something more  
> prominent and normative (i.e., if someone chooses to conform to it,  
> they should be able to know what that means), and the spec needs to  
> be named to reflect that.

I agree that the rules should be more prominent and normative. The  
HTML spec, and any other referencing spec, should be able to cite a  
specific section for the algorithm to convert a loosely-processed  
reference into a URI, or to perform a lenient resolution relative to a  
base, or whatever. And that algorithm should be normatively defined,  
even if it is only applicable to cases where other specs require  
lenient processing.
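
To make "convert a loosely-processed reference into a URI" and "lenient resolution relative to a base" concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the algorithm from any draft discussed in this thread: the function name `lenient_to_uri` and the safe-character set are hypothetical, non-ASCII characters are assumed to be encoded as UTF-8 everywhere, and internationalized host names (which would need IDNA rather than percent-encoding) are ignored.

```python
from urllib.parse import quote, urljoin

# Assumption: leave RFC 3986 reserved and unreserved characters alone, and
# keep "%" so that existing percent-escapes are not encoded a second time.
_SAFE = "-._~:/?#[]@!$&'()*+,;=%"


def lenient_to_uri(reference: str, base: str) -> str:
    """Hypothetical helper: map a loosely written reference to an absolute URI."""
    # Step 1: drop leading and trailing whitespace from the raw input.
    cleaned = reference.strip()
    # Step 2: percent-encode everything outside the safe set (spaces, quotes,
    # non-ASCII characters as UTF-8 bytes).
    escaped = quote(cleaned, safe=_SAFE)
    # Step 3: resolve the now-valid reference against the base URI.
    return urljoin(base, escaped)


if __name__ == "__main__":
    print(lenient_to_uri("  a b/\u00e9.html ", "http://example.com/dir/index.html"))
```

A referencing spec could then cite each numbered step separately, which is the kind of precise citation that a prominent, normative definition would allow.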

[...snip...]

>>
>> Note: it's not clear to me why HTTPbis would want to reference  
>> lenient processing rules for URIs/IRIs. Are HTTP servers and  
>> proxies not strict in what they accept?
>
> It's been discussed for the Location header. No decision as of yet,  
> though.
>
> If something like Location (i.e., something that needs a URI, not an  
> IRI, as output) needs this algorithm, including this in the IRI spec  
> is going to make things more complex.

It seems like putting these rules in the HTML5 spec would make things  
even harder than that for the Location header, since it would create a  
dependency inversion. So perhaps you don't entirely agree with Roy  
after all?

Regards,
Maciej


>
>>
>> Regards,
>> Maciej
>>
>> [1] http://www.w3.org/html/wg/href/draft.html
>>
>>>
>>>
>>> On 26/09/2009, at 5:19 AM, Roy T. Fielding wrote:
>>>
>>>> Larry, your changes to the IRI draft make it incomprehensible.
>>>>
>>>> I think this is getting ridiculous.  We don't need a working group.
>>>> We don't even need an updated draft, at least not for LEIRI, Href,
>>>> and whatever it is that we call HTML5 references.
>>>>
>>>> HTML5 wants to specify the *process* of taking arbitrary data entry
>>>> in various places and transforming it into a) something the browser
>>>> displays, and b) a URI for use on the wire.  What they are calling URL
>>>> is the arbitrary data entry part, NOT the resulting URI, which is why
>>>> it is so frigging annoying and inconsistent with all other standards.
>>>> LEIRI made the same mistake.
>>>>
>>>> The purpose of IRI is to specify the allowed syntax for what one
>>>> might see on the side of a bus as a Web address in i18n-friendly,
>>>> human-readable form.  That is why the IRI syntax does not allow
>>>> common delimiters like whitespace, quotes, and brackets (except
>>>> for IPv6 literals).  It does not define a data-entry box.
>>>>
>>>> URI is in the same boat, except that it also defines the allowed
>>>> syntax for on-the-wire usage in HTTP, etc.  It is intentionally
>>>> limited for use in embedded plain text.  It does not define a
>>>> data entry box.
>>>>
>>>> Both IRI and URI are intended to define standards for the Internet
>>>> in the same way as the US Postal Service residential addresses have
>>>> a standard normal form.  The fact that an envelope does not prevent
>>>> a person from writing an arbitrary form of address in the hope that
>>>> a mail carrier can interpret it for them is not an indication that
>>>> the standard is somehow "wrong" -- what matters is that following
>>>> the standard is known to be interoperable, and everything else is
>>>> just an experiment in forgiveness.
>>>>
>>>> What HTML5 wants to define is how to process a data entry box
>>>> in the same way across all browser implementations, and there is
>>>> nothing wrong with such a definition appearing in HTML5 *except*
>>>> for the fact that the editor has chosen an existing well-known
>>>> term that means something else to describe it, which conflicts
>>>> with all prior uses of that term.  Just stop that nonsense by
>>>> changing the HTML5 draft wording to talk about references, not URLs.
>>>> HTML5 does not require changes to IRI, and certainly not to URI.
>>>>
>>>> Changing IRI (or URI) so that it conforms both to the side of a
>>>> bus definition and a data entry definition is insane.  They are
>>>> not the same thing.  They do not share the same concerns.  A
>>>> reference might allow anything, depending on its context and the
>>>> technology used to parse it; it is the post-processing that
>>>> produces an IRI/URI.
>>>>
>>>> ....Roy
>>>>
>>>
>>>
>>> --
>>> Mark Nottingham     http://www.mnot.net/
>>>
>>>
>>
>
>
> --
> Mark Nottingham     http://www.mnot.net/
>
Received on Sunday, 27 September 2009 00:20:23 UTC