Re: scope of applicability for IRIs

On 2009/06/21 8:36, Larry Masinter wrote:
> John Klensin wrote (on
>
>> I do believe that there has
>> been a long-term, and often largely hidden, disagreement about
>> the applicability of IRIs --whether they are about standardizing
>> a user interface element or whether they are expected to act as
>> protocol elements-- that complicates these discussions.
>
>
> My original goal -- in defining "IRI" as separate from "URI" --

[For people less familiar with the history, Larry is talking about the 
very start of the IRI spec, which goes back to 1997, at
http://tools.ietf.org/html/draft-masinter-url-i18n-00]

> was that the applicability of IRIs was to be independently
> determined. That is, applications and protocols would *choose*
> whether to reference the URI document or the IRI document
> (and a specific non-terminal within the IRI document.)
>
> That is, specifications which cited URI would not automatically
> be "upgraded" to use IRIs, but rather must explicitly choose.
>
> I think this is possible with the URI ->  IRI path, and that
> it has been explicit, although a bit haphazard.

Do you mean explicit in the IRI spec, or explicit for the affected 
'user' specs? For the former, see below. For the latter, I think it's 
also more or less true, although for HTML, "dealing with IRIs" started 
as advice on how to deal with "erroneous" URIs (URIs that contained 
non-ASCII characters), and for many other specs, the fact that they use 
what is essentially an IRI was originally expressed only by 
description ("use UTF-8 and then %-encoding to create a URI"), 
rather than by reference. That's also mostly the reason for the 
'divergent practice' that Larry talks about in the following paragraph.

> Unfortunately, because of divergent practice, there are
> more than one non-terminal protocol elements in common use
> which require documentation, including LEIRI for reference
> by XML-based specifications and HREF (a.k.a. Web Address)
> explicitly to match HTML5. I don't see any way to avoid
> the divergence, even if discouraging it.

I basically agree, although "don't see a way to avoid the divergence" 
might be interpreted as the divergence getting bigger and bigger. That 
should definitely not be the case, as we can, at this point in time, 
nail down the various divergent behaviors, and strongly advise against 
them (already done for LEIRIs). Actually divergent data is already rare, 
and is probably going to become even more rare in the future, so in 
terms of data frequency, I hope we'll be converging.

> Do you think this approach might allow the IRI document
> to move forward and let the applicability discussions
> continue in more appropriate contexts?
>
> I will try to make this approach explicit in the IRI document.

In RFC 3987, this is currently in Section 1.2, Applicability, which starts:

    IRIs are designed to be compatible with recommendations for new URI
    schemes [RFC2718].  The compatibility is provided by specifying a
    well-defined and deterministic mapping from the IRI character
    sequence to the functionally equivalent URI character sequence.
    Practical use of IRIs (or IRI references) in place of URIs (or URI
    references) depends on the following conditions being met:

    a.  A protocol or format element should be explicitly designated to
        be able to carry IRIs.  The intent is not to introduce IRIs into
        contexts that are not defined to accept them.  For example, XML
        schema [XMLSchema] has an explicit type "anyURI" that includes
        IRIs and IRI references. Therefore, IRIs and IRI references can
        be in attributes and elements of type "anyURI".  On the other
        hand, in the HTTP protocol [RFC2616], the Request URI is defined
        as a URI, which means that direct use of IRIs is not allowed in
        HTTP requests.

(see http://tools.ietf.org/html/rfc3987#section-1.2). Is that about what 
everybody has in mind? If not, what should be changed? Comments appreciated!
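
For readers who want something concrete, the "well-defined and 
deterministic mapping" mentioned in the quoted text can be sketched 
roughly as below. This is only a minimal illustration, not the full 
procedure from RFC 3987: the function name iri_to_uri is made up for 
this example, the input is assumed to be an already valid IRI held as a 
Unicode string, and both Unicode normalization and any special handling 
of internationalized host names are left out.

```python
def iri_to_uri(iri: str) -> str:
    """Illustrative sketch: map an IRI to a URI by %-encoding non-ASCII.

    Assumption: `iri` is a syntactically valid IRI. ASCII characters are
    copied through unchanged, including any existing %-escapes.
    """
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            # ASCII characters are already legal URI characters here.
            out.append(ch)
        else:
            # Encode the character as UTF-8, then write each octet as %HH.
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)


if __name__ == "__main__":
    print(iri_to_uri("http://example.org/r\u00e9sum\u00e9"))
```

The same input always produces the same output, which is the property 
the Applicability text relies on. An element that is only defined to 
carry URIs would receive the mapped form, not the IRI itself.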

(John, for your reference, this section is the way it is partly because 
of input from Ted and Leslie.)

Regards,    Martin.

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp

Received on Monday, 22 June 2009 08:36:09 UTC