Re: Re: Request for review of Turtle (an RDF serialization) media type: text/turtle

* Julian Reschke <julian.reschke@gmx.de> [2007-12-18 15:24+0100]
> Eric Prud'hommeaux wrote:
>> W3C is about to publish a Team Submission for the RDF serialization
>> Turtle. A mockup of the document to be published is at
>>   http://www.w3.org/2007/11/21-turtle
>>
>> Because the document will include the text of the media type
>> registration, I am vetting this registration with ietf-types before
>> publishing the document. Some discussion about the claim to force
>> utf-8 encoding (and not require that in a charset parameter) can be
>> seen at http://lists.w3.org/Archives/Public/www-archive/2007Dec/
>> (Subject: Media types for RDF languages N3 and Turtle)
>> I got moderator-actioned for having too many folks in the Cc so
>> I'm Bcc'ing them all in this request for review:
>> ...
>
> 1) If text/* proves to be problematic, why not use application/*?

Turtle is a form of RDF that was designed specifically to be
human-readable. It is unlikely that RDF will ever have a more texty
expression.

RFC 2046 §4.1, Text Media Type, ¶2:
[[
   Beyond plain text, there are many formats for representing what might
   be known as "rich text".  An interesting characteristic of many such
   representations is that they are to some extent readable even without
   the software that interprets them.  It is useful, then, to
   distinguish them, at the highest level, from such unreadable data as
   images, audio, or text represented in an unreadable form. In the
   absence of appropriate interpretation software, it is reasonable to
   show subtypes of "text" to the user, while it is not reasonable to do
   so with most nontextual data. Such formatted textual data should be
   represented using subtypes of "text".
]]

Yes, text/* is problematic, but if we figure out what it can and can't
do, then the institutional knowledge will hopefully transfer to the
next poor sucker who tries to register a non-ASCII language in text/ .

Certainly, I'm willing to give up and fall back to application/ , but
I worry that no modern languages are appropriate for text/ and that we
are hostage to a legacy exactly counter to the intent of the tree.


> 2) Also, keep in mind that while RFC2046 may be interpreted not to mandate 
> the ASCII default for text types other than text/plain, there's also 
> RFC2616 saying...:
>
>    The "charset" parameter is used with some media types to define the
>    character set (section 3.4) of the data. When no explicit charset
>    parameter is provided by the sender, media subtypes of the "text"
>    type are defined to have a default charset value of "ISO-8859-1" when
>    received via HTTP. Data in character sets other than "ISO-8859-1" or
>    its subsets MUST be labeled with an appropriate charset value. See
>    section 3.4.1 for compatibility problems. -- 
> <http://tools.ietf.org/html/rfc2616#section-3.7.1>

(http://www.w3.org/Protocols/rfc2616/rfc2616-sec3#sec3.4.1 explains the
 motivation for this rather painful rule: the current practice of some
 old and broken HTTP/1.0 implementations.)

This asserts extra constraints on participants in HTTP transactions
above and beyond those of other agents exchanging MIME, e.g. MTAs and
MUAs. If the media type for Turtle were registered as I wrote, RFC 2616
would say that web servers would still have to supply the
charset=UTF-8 parameter. When that deference to legacy is made
obsolete, the media type will not need updating. This reduces our
choices to comparing the relative costs of:
  1. (application/) failing to present a fairly intelligible form to
     the consumer.
  2. (text/) having to include charset in HTTP for the foreseeable
     future.
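
To make the RFC 2616 rule quoted above concrete, here is a minimal
Python sketch of how a recipient following §3.7.1 might choose a
charset for an HTTP response. The function name effective_charset is
hypothetical, and the sketch ignores RFC 2231 encoded parameters and
any sniffing a real client might do; it is an illustration, not a
description of any particular implementation.

```python
from email.message import Message


def effective_charset(content_type_header):
    """Charset an RFC 2616 recipient would assume for an HTTP body.

    Hypothetical helper: an explicit charset parameter wins; otherwise
    text/* falls back to ISO-8859-1 (RFC 2616 section 3.7.1); other
    top-level types get no HTTP-level default (None).
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    charset = msg.get_param("charset")
    if charset is not None:
        return charset.lower()
    if msg.get_content_maintype() == "text":
        return "iso-8859-1"
    return None


if __name__ == "__main__":
    # Without the parameter, a UTF-8 Turtle document served as
    # text/turtle would be decoded as ISO-8859-1 by such a recipient.
    body = "\u00e9".encode("utf-8")
    print(body.decode(effective_charset("text/turtle")))
    print(body.decode(effective_charset("text/turtle; charset=UTF-8")))
```

Under these assumptions, a server that omits charset=UTF-8 on
text/turtle gets its non-ASCII characters misread, which is the cost
in item 2 above.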


> See also the related HTTPbis issue: 
> <http://www.w3.org/Protocols/HTTP/1.1/rfc2616bis/issues/#i20>
>
> BR, Julian

-- 
-eric

office: +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
mobile: +1.617.599.3509

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Tuesday, 18 December 2007 17:48:59 UTC