- From: Christian Hujer <Christian.Hujer@itcqis.com>
- Date: Tue, 17 Jan 2006 22:33:53 +0100
- To: www-html@w3.org
Hi,

On Tuesday, 17 January 2006 21:34, Garret Wilson wrote:
> Patrick H. Lauke wrote:
> > Garret Wilson wrote:
> >> The former is a fragment of a web page on XHTML
> >> explaining how to use the <em> element, and actually displays the
> >> literal string "<em>foo</em>".
> >
> > Should the < and > not be encoded as &lt; and &gt; in that particular
> > case?
>
> No! If you have a text file (e.g. "instructions.txt" written with
> notepad) containing the sentence, "this is how you use the emphasis tag:
> <em>foo</em>", should < and > be encoded? No. In fact, look at the
> source of this very email---you'll see that < and > are not encoded.
> That's what plain text means.
>
> Now, the application when constructing the larger XHTML document *will*
> have to encode those characters in the *destination* document. Those
> characters will not be encoded in the *source* text/plain fragment,
> though. (If the source fragment is an XHTML fragment, on the other hand,
> those characters, if meant to be taken literally rather than interpreted
> as markup, would need to be encoded.)
>
> Put another way, the effective text of the following two fragments are
> identical:
>
> "<em>foo</em>" (content type: text/plain)
> "&lt;em&gt;foo&lt;/em&gt;" (content type: XHTML fragment)

Yes.

> (Nit-picky point: the two fragments above don't technically represent
> identical content, because the former represents a string and the latter
> represents an XML Text node, but they should result in identical text in
> the destination document.)

Well, even from an XML point of view they are the same. For text/plain I'd assume an XInclusion of parse=text, while for fragments I'd expect an entity reference substitution or, if possible, an XInclusion of parse=xml. The resulting character sequence in the text node would actually be equal for the example.

> It is impossible to represent the XHTML fragment "<em>foo</em>" in a
> text/plain fragment, because plain text (naturally) has no concept of
> syntactical structure.
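To make the equivalence discussed above concrete, here is a minimal sketch in Python using only the standard library. It is an illustration, not part of the original thread: the variable names and the wrapping <p> element are assumptions. It takes the same literal string once as a text/plain fragment (escaped by the application when building the destination document) and once as an XHTML fragment (already escaped at the source), and compares the resulting text nodes.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Source fragments as they would be stored (hypothetical example values).
plain_fragment = "<em>foo</em>"              # content type: text/plain
xhtml_fragment = "&lt;em&gt;foo&lt;/em&gt;"  # content type: XHTML fragment

# text/plain: the application must escape when building the destination.
destination_from_plain = "<p>" + escape(plain_fragment) + "</p>"

# XHTML fragment: already valid XML content, so it is inserted unchanged.
destination_from_xhtml = "<p>" + xhtml_fragment + "</p>"

# Parse both destination documents and read the text node of <p>.
text_from_plain = ET.fromstring(destination_from_plain).text
text_from_xhtml = ET.fromstring(destination_from_xhtml).text

print(text_from_plain == text_from_xhtml)
print(text_from_plain)
```

Both destination documents are byte-for-byte the same here, so the parsed text nodes are equal as well.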
On the text/plain issue: text/plain should be used when the message body labelled Content-Type: text/plain is actually meant to be interpreted as plain text. For an XHTML fragment that is served for further processing, where it will be combined with other fragments to form a real XHTML document, text/plain is not appropriate. If the fragments are external entities in the XML sense, application/xml-external-parsed-entity fits best, and even application/octet-stream would be much better than text/plain for that case.

From an XInclude / entity reference substitution point of view, I'd assume that text/plain is meant to be included with parse=text and application/xml-external-parsed-entity with parse=xml, while application/octet-stream leaves it undetermined.

Anyway, I'd say there is no general answer to whether the characters need to be encoded. If the XHTML fragment is included as part of an XML document, encoding the characters that are part of the XHTML markup is not a good idea. When the XHTML is part of the XML markup of the containing document, XPath and XQuery can be used on the XHTML; if the XHTML markup is encoded, that's impossible. Also, if the XHTML is contained as markup, schema or even DTD validation of the XHTML fragment is possible. These are some more arguments for not encoding the markup of XHTML fragments.

-- 
Christian Hujer
Free software developer
E-Mail: Christian.Hujer@itcqis.com
WWW: http://www.itcqis.com/ http://daimonin.sf.net/
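As a rough illustration of the XPath argument in the message above, the following Python sketch parses one container document that holds the XHTML as real markup and one that holds it with the markup characters encoded. The document strings and variable names are made up for the example, and ElementTree supports only a limited subset of XPath, so this is a sketch of the idea rather than a full XPath or XQuery demonstration.

```python
import xml.etree.ElementTree as ET

# Hypothetical container documents: same fragment, two ways of embedding it.
as_markup = ET.fromstring("<doc><p>Use <em>foo</em> here.</p></doc>")
as_encoded = ET.fromstring(
    "<doc><p>Use &lt;em&gt;foo&lt;/em&gt; here.</p></doc>"
)

# A path query for <em> elements anywhere below the root.
markup_hits = as_markup.findall(".//em")
encoded_hits = as_encoded.findall(".//em")

print(len(markup_hits), len(encoded_hits))

# In the encoded case the "tags" are just characters of the text node.
encoded_text = as_encoded.find("p").text
print(encoded_text)
```

With the fragment embedded as markup, the query finds the em element; with the markup encoded, there is no element to find, only a text node that happens to contain angle brackets.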
Received on Tuesday, 17 January 2006 21:33:58 UTC