- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 18 Dec 2007 12:48:22 -0500
- To: Julian Reschke <julian.reschke@gmx.de>
- Cc: ietf-types@alvestrand.no, www-archive@w3.org
- Message-ID: <20071218174822.GT8244@w3.org>
* Julian Reschke <julian.reschke@gmx.de> [2007-12-18 15:24+0100]
> Eric Prud'hommeaux wrote:
>> W3C is about to publish a Team Submission for the RDF serialization
>> Turtle. A mockup of the document to be published is at
>>   http://www.w3.org/2007/11/21-turtle
>>
>> Because the document will include the text of the media type
>> registration, I am vetting this registration with ietf-types before
>> publishing the document. Some discussion about the claim to force
>> utf-8 encoding (and not require that in a charset parameter) can be
>> seen at http://lists.w3.org/Archives/Public/www-archive/2007Dec/
>> (Subject: Media types for RDF languages N3 and Turtle)
>> I got moderator-actioned for having too many folks in the Cc so
>> I'm Bcc'ing them all in this request for review:
>> ...
>
> 1) If text/* proves to be problematic, why not use application/*?

Turtle is a form of RDF that was designed to be specifically
human-readable. It is unlikely that RDF will ever have a more texty
expression.

RFC 2046 §4.1, Text Media Type, ¶2:
[[
   Beyond plain text, there are many formats for representing what might
   be known as "rich text". An interesting characteristic of many such
   representations is that they are to some extent readable even without
   the software that interprets them. It is useful, then, to
   distinguish them, at the highest level, from such unreadable data as
   images, audio, or text represented in an unreadable form. In the
   absence of appropriate interpretation software, it is reasonable to
   show subtypes of "text" to the user, while it is not reasonable to do
   so with most nontextual data. Such formatted textual data should be
   represented using subtypes of "text".
]]

Yes, text/* is problematic, but if we figure out what it can and can't
do, then the institutional knowledge will hopefully transfer to the next
poor sucker who tries to register a non-ascii language in text/ .

I'm certainly willing to give up and fall back to application/ , but I
worry that no modern languages are appropriate for text/ and that we
are hostage to a legacy exactly counter to the intent of the tree.

> 2) Also, keep in mind that while RFC2046 may be interpreted not to mandate
> the ASCII default for text types other than text/plain, there's also
> RFC2616 saying...:
>
>    The "charset" parameter is used with some media types to define the
>    character set (section 3.4) of the data. When no explicit charset
>    parameter is provided by the sender, media subtypes of the "text"
>    type are defined to have a default charset value of "ISO-8859-1" when
>    received via HTTP. Data in character sets other than "ISO-8859-1" or
>    its subsets MUST be labeled with an appropriate charset value. See
>    section 3.4.1 for compatibility problems.
>
> -- <http://tools.ietf.org/html/rfc2616#section-3.7.1>

(http://www.w3.org/Protocols/rfc2616/rfc2616-sec3#sec3.4.1 explains the
motivation for this rather painful rule: the current practice of some
old and broken HTTP/1.0 implementations.)

This asserts extra constraints on participants in HTTP transactions
above and beyond those on other agents exchanging MIME, e.g. MTAs and
MUAs. If the media type for Turtle were registered as I wrote it,
RFC 2616 would say that web servers would still have to supply the
charset=UTF-8 parameter. When that deference to legacy gets obsolesced,
the media type will not need updating.

This reduces our choices to comparing the relative costs of:
  1. with application/*, failing to present a fairly intelligible form
     to the consumer.
  2. with text/*, having to include charset in HTTP for the foreseeable
     future.

> See also the related HTTPbis issue:
> <http://www.w3.org/Protocols/HTTP/1.1/rfc2616bis/issues/#i20>
>
> BR, Julian

-- 
-eric

office: +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
mobile: +1.617.599.3509

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
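To make the RFC 2616 section 3.7.1 rule quoted above concrete, here is a minimal sketch, in Python, of how an HTTP recipient might decide which charset applies to a response body. The function name effective_charset and the sample Content-Type values are hypothetical, chosen only for illustration; text/turtle is the type under discussion here, not an established registration. The sketch models only the HTTP default, not what RFC 2046 implies for MIME agents such as MTAs and MUAs.

```python
from email.message import Message
from typing import Optional


def effective_charset(content_type: str) -> Optional[str]:
    """Return the charset an HTTP/1.1 (RFC 2616) recipient would assume.

    Hypothetical helper: an explicit charset parameter wins; otherwise
    text/* falls back to ISO-8859-1 per RFC 2616 section 3.7.1, and
    other top-level types get no protocol-level default (None).
    """
    msg = Message()
    msg["Content-Type"] = content_type

    explicit = msg.get_param("charset")
    if isinstance(explicit, tuple):
        # RFC 2231-encoded parameter: (charset, language, value)
        explicit = explicit[2]
    if explicit:
        return explicit.lower()

    if msg.get_content_maintype() == "text":
        return "iso-8859-1"
    return None


if __name__ == "__main__":
    for header in (
        "text/turtle",                 # no parameter: HTTP default applies
        "text/turtle; charset=UTF-8",  # what servers must send today
        "application/example",         # hypothetical application/* fallback
    ):
        print(header, "->", effective_charset(header))
```

Under this reading, an unlabeled text/turtle response is treated as ISO-8859-1 by a conforming HTTP recipient, which is why servers would have to keep sending charset=UTF-8 for as long as the RFC 2616 default stands.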
Received on Tuesday, 18 December 2007 17:48:59 UTC