- From: Sean B. Palmer <sean@miscoranda.com>
- Date: Mon, 17 Dec 2007 15:51:29 +0000
- To: "Garret Wilson" <garret@globalmentor.com>
- Cc: "Eric Prud'hommeaux" <eric@w3.org>, ietf-types@iana.org, "Tim Berners-Lee" <timbl@w3.org>, "Daniel W. Connolly" <connolly@w3.org>, "Dave Beckett" <dave@dajobe.org>, "Lee Feigenbaum" <lee@thefigtrees.net>, "Graham Klyne" <GK@ninebynine.org>, "Dan Brickley" <danbri@danbri.org>, www-archive@w3.org
On Dec 17, 2007 3:22 PM, Garret Wilson <garret@globalmentor.com> wrote:

> There exists serious concern regarding the use of a text top-level
> type for N3. See the recent discussion on www-rdf-comments.

Eric and I discussed that in some detail prior to and subsequent to the start of this thread. One thing that I don't understand is what you said here:

[[[
I suppose that, with the popular understanding that RFC 2046 requires a default character set of US-ASCII if there is no charset parameter, then it's almost as true as if RFC 2046 said so explicitly.
]]] - http://lists.w3.org/Archives/Public/www-rdf-comments/2007OctDec/0017

It seems very clear to me that RFC 2046 states explicitly that US-ASCII is required if there is no charset parameter. Here are the relevant quotes:

   The default character set, which must be assumed in the absence of a charset parameter, is US-ASCII.

   ...

   Note that the character set used, if anything other than US-ASCII, must always be explicitly specified in the Content-Type field.

The way I read that, that doesn't leave any room for a text/anything specification setting its own default.

As for the CRLF requirement, that CRLF and *only* CRLF be used for line breaks, Dan Brickley commented in response to that that text/xml was widely regarded troublesome; but it's not clear from his citations that CRLF has anything to do with the troublesome nature, only charset defaulting.

It seems that most of the problem, as you mentioned in the www-rdf-comments thread, is that the text subtree is simply broken. RFC 2046 just wasn't written to deal with the Unicode world. Check out the following, for example:

   A SINGLE character set that can be used universally for representing all of the world's languages in Internet mail would be preferrable. Unfortunately, existing practice in several communities seems to point to the continued use of multiple character sets in the near future. A small number of standard character sets are, therefore, defined for Internet use in this document.

And it defines US-ASCII and ISO-8859-X. It's not RFC 2046's fault that it wasn't prescient, but it's *out-of-date* now and perhaps ought to be obsoleted so that text/* can be used as intended rather than as we're currently forced?

But of course there is the question of what MIME implementations will do and what problems, possibly serious, it would cause to, for example, make utf-8 the new text/* default. It would need a lot of discussion and a new RFC.

Note that TimBL has never, as far as I know, suggested disregarding the charset defaulting requirement, just the CRLF requirement which he mightn't even be aware of. And as it seems that the charset defaulting is the thing that most people are anxious about, I'd be happy for text/rdf+n3; charset=utf-8 or text/n3; charset=utf-8 to go forwards, even ignoring the fact that it disregards the CRLF requirement.

Thanks,

--
Sean B. Palmer, http://inamidst.com/sbp/
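
For illustration only, here is a minimal Python sketch of the defaulting rule as the RFC 2046 quotes above read: a text/* Content-Type with no charset parameter is treated as US-ASCII, and anything else has to be named explicitly. The function name effective_charset is hypothetical and not part of any specification; the sketch relies on the standard library's email.message.Message for header parsing and makes no attempt to model the CRLF requirement.

```python
from email.message import Message
from typing import Optional


def effective_charset(content_type: str) -> Optional[str]:
    """Return the charset a strict RFC 2046 reader would assume.

    Hypothetical helper: for text/* types, an absent charset
    parameter means US-ASCII; for other types no default applies
    here and None is returned when no charset is given.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_maintype() == "text":
        # RFC 2046 default for the text subtree.
        return msg.get_content_charset(failobj="us-ascii")
    return msg.get_content_charset()


if __name__ == "__main__":
    for header in ("text/n3", "text/n3; charset=utf-8", "text/rdf+n3; charset=UTF-8"):
        print(header, "->", effective_charset(header))
```

Under this reading, a bare text/n3 label would have non-ASCII N3 content interpreted as US-ASCII, which is why the explicit charset=utf-8 forms matter.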
Received on Monday, 17 December 2007 15:51:40 UTC