Re: [iri] #131: Using document charset causes interoperability problems from Peter Saint-Andre on 2012-07-21 (public-iri@w3.org from July 2012)

From: Peter Saint-Andre <stpeter@stpeter.im>
Date: Sat, 21 Jul 2012 15:37:03 -0600
To: Larry Masinter <masinter@adobe.com>
CC: "draft-ietf-iri-3987bis@tools.ietf.org" <draft-ietf-iri-3987bis@tools.ietf.org>, "public-iri@w3.org" <public-iri@w3.org>
Message-ID: <500B20FF.90100@stpeter.im>

<hat type='individual'/>

On 7/21/12 10:06 AM, Larry Masinter wrote:
> I hate this feature, and would love to get rid of it, but let's acknowledge at least somewhere that it happens. That is, the interoperability problems are real, but not documenting it here doesn't solve the problem.
> 
> I think what the text in the document intended was that whether there _was_ a "document charset" at all depended on the format of the document: yes for HTML, maybe for Word (up to Word), no for PDF, maybe (not yet defined) for text/plain.
> 
> I can see two choices that might work:
> 
> * Any document format that wants this kind of processing has to say that the strings it uses aren't really IRIs; they're funny strings that get preprocessed to turn them into IRIs or URIs.
> * The IRI spec continues to explicitly define this document-charset-dependent behavior, but is more explicit about the rules for where the "document charset" comes from.
> 
> I could go with either one of those. How do those seem to the group?

In the interest of calling a spade a spade, I'd be in favor of the first
option: they're not really IRIs, but they can be turned into IRIs.
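
As a rough illustration of the interoperability problem under discussion (a sketch only, not text from the draft or from any implementation; the helper name percent_encode and the sample value are assumptions made for the example), the same non-ASCII character percent-encodes to different octets depending on which charset the preprocessing step uses:

```python
from urllib.parse import quote


def percent_encode(text: str, charset: str) -> str:
    """Hypothetical helper: percent-encode every character of text
    after converting it to bytes in the given charset."""
    return quote(text, safe="", encoding=charset)


# Assumed sample value for a query component containing one non-ASCII character.
query_value = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# Encoded as UTF-8, the usual IRI-to-URI mapping.
utf8_form = percent_encode(query_value, "utf-8")

# Encoded using an assumed ISO-8859-1 "document charset".
latin1_form = percent_encode(query_value, "iso-8859-1")

print(utf8_form)    # %C3%A9
print(latin1_form)  # %E9
```

Two consumers that disagree about the charset would therefore send different URIs for the same string, which is why it seems cleaner to treat such strings as input to a preprocessing step rather than as IRIs.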

Peter

-- 
Peter Saint-Andre
https://stpeter.im/

Received on Saturday, 21 July 2012 21:37:33 UTC