Re: content type for XHTML fragments: reformulated from Garret Wilson on 2006-01-17 (www-html@w3.org from January 2006)

From: Garret Wilson <garret@globalmentor.com>
Date: Tue, 17 Jan 2006 12:34:00 -0800
To: "Patrick H. Lauke" <redux@splintered.co.uk>
CC: www-html@w3.org
Message-ID: <43CD54B8.4000805@globalmentor.com>

Patrick H. Lauke wrote:
>
> Garret Wilson wrote:
>>  The former is a fragment of a web page on XHTML
>> explaining how to use the <em> element, and actually displays the 
>> literal string "<em>foo</em>".
>
> Should the < and > not be encoded as &lt; and &gt; in that particular 
> case?
>
No! If you have a text file (e.g. "instructions.txt" written with 
notepad) containing the sentence, "this is how you use the emphasis tag: 
<em>foo</em>", should < and > be encoded? No. In fact, look at the 
source of this very email---you'll see that < and > are not encoded. 
That's what plain text means.

Now, the application when constructing the larger XHTML document *will* 
have to encode those characters in the *destination* document. Those 
characters will not be encoded in the *source* text/plain fragment, 
though. (If the source fragment is an XHTML fragment, on the other hand, 
those characters, if meant to be taken literally rather than interpreted 
as markup, would need to be encoded.)
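
As a rough sketch of that encoding step (not any particular
application's code), here is how a text/plain fragment might be embedded
in a destination XHTML document. The function name embed_plain_text and
the enclosing <p> element are assumptions made for illustration;
xml.sax.saxutils.escape is the standard-library routine that encodes
&, < and >.

```python
from xml.sax.saxutils import escape


def embed_plain_text(fragment: str) -> str:
    """Embed a text/plain fragment in XHTML, encoding markup characters.

    The <p> wrapper is a hypothetical choice of destination element.
    """
    return "<p>" + escape(fragment) + "</p>"


# The source fragment is plain text, so < and > appear unencoded in it.
source = "this is how you use the emphasis tag: <em>foo</em>"

# Only the destination document carries the encoded form.
destination = embed_plain_text(source)
print(destination)
```

The source string is never modified; the encoding happens only at the
point where the text enters the XHTML document.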

Put another way, the effective text of the following two fragments is 
identical:

"<em>foo</em>" (content type: text/plain)
"&lt;em&gt;foo&lt;/em&gt;" (content type: XHTML fragment)

(Nit-picky point: the two fragments above don't technically represent 
identical content, because the former represents a string and the latter 
represents an XML Text node, but they should result in identical text in 
the destination document.)
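
One way to check this equivalence, as a sketch only: parse the XHTML
fragment with an XML parser and compare the resulting text with the
plain string. The <span> wrapper is an assumption added here so that the
fragment has a root element the parser will accept.

```python
import xml.etree.ElementTree as ET

plain_fragment = "<em>foo</em>"              # content type: text/plain
xhtml_fragment = "&lt;em&gt;foo&lt;/em&gt;"  # content type: XHTML fragment

# Hypothetical wrapper element so the fragment parses as well-formed XML.
root = ET.fromstring("<span>" + xhtml_fragment + "</span>")

# The parser decodes the entity references, leaving a single text node
# and no child elements.
effective_text = root.text
print(effective_text == plain_fragment, len(root))
```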

It is impossible to represent the XHTML fragment "<em>foo</em>" (that 
is, an actual em element containing the text "foo") in a text/plain 
fragment, because plain text (naturally) has no concept of syntactic 
structure.
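
To make the contrast concrete, here is a small sketch that parses the
same characters as XHTML. Again, the <span> wrapper is only an
assumption to give the parser a root element.

```python
import xml.etree.ElementTree as ET

# Parsed as XHTML, the characters are markup: one em element whose
# content is "foo".
root = ET.fromstring("<span><em>foo</em></span>")
child_tags = [child.tag for child in root]
rendered_text = "".join(root.itertext())

# As text/plain, the very same characters are just a 12-character string
# with no elements at all.
plain = "<em>foo</em>"

print(child_tags, rendered_text, len(plain))
```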

Garret

Received on Tuesday, 17 January 2006 20:34:21 UTC