Re: [iri] #131: Using document charset causes interoperability problems

From: Peter Saint-Andre <stpeter@stpeter.im>
Date: Sat, 21 Jul 2012 15:37:03 -0600
Message-ID: <500B20FF.90100@stpeter.im>
To: Larry Masinter <masinter@adobe.com>
CC: "draft-ietf-iri-3987bis@tools.ietf.org" <draft-ietf-iri-3987bis@tools.ietf.org>, "public-iri@w3.org" <public-iri@w3.org>
<hat type='individual'/>

On 7/21/12 10:06 AM, Larry Masinter wrote:
> I hate this feature, and would love to get rid of it, but let's acknowledge at least somewhere that it happens. That is, the interoperability problems are real, but not documenting it here doesn't solve the problem.
> I think what the text in the document intended was that whether there _was_ a "document charset" at all depended on the format of the document... yes for HTML, maybe for Word (up to Word), no for PDF, maybe (not yet defined) for text/plain.
> I can see two choices that might work:
> * Any document format that wishes this kind of processing has to say that what they are using aren't really IRIs, they're funny strings that get preprocessed to turn them into IRIs or URIs.
> * The IRI spec (continues to) explicitly defines this document-charset-dependent behavior, but is more explicit about the rules for where "document charset" comes from.
> I could go with either one of those. How do those seem to the group?

In the interest of calling a spade a spade, I'd be in favor of the first
option: they're not really IRIs, but they can be turned into IRIs.
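
To make the first option concrete, here is a minimal sketch of such a
preprocessing step. It is an illustration only, not text from the draft:
the function name preprocess_reference is hypothetical, and it assumes the
HTML-style behavior in which only the query component is percent-encoded
using the document charset while the path and fragment always use UTF-8.
Host (IDNA) handling and unencodable characters are left out.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched: reserved characters plus "%" so that
# existing percent-escapes are not encoded a second time.
_SAFE = "/?%:@!$&'()*+,;="


def preprocess_reference(raw: str, document_charset: str = "utf-8") -> str:
    """Turn a document-embedded reference string into a URI.

    Hypothetical helper: the input is treated as a "funny string" that
    depends on the document charset, not as an IRI in its own right.
    """
    parts = urlsplit(raw)
    # Path and fragment: non-ASCII characters are always encoded as UTF-8.
    path = quote(parts.path, safe=_SAFE, encoding="utf-8")
    fragment = quote(parts.fragment, safe=_SAFE, encoding="utf-8")
    # Query: assumed to follow the document charset, which is the source
    # of the interoperability problem discussed in this thread.
    query = quote(parts.query, safe=_SAFE, encoding=document_charset)
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


ref = "http://example.org/caf\u00e9?q=caf\u00e9"
print(preprocess_reference(ref, "utf-8"))
print(preprocess_reference(ref, "iso-8859-1"))
```

The same reference string yields two different URIs depending only on the
charset of the document that contains it, so the string by itself does not
identify a single resource until that preprocessing has been applied.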


Peter Saint-Andre
Received on Saturday, 21 July 2012 21:37:33 UTC
