CDATA and HTML Compatibility (was: escaping escaping) from David Woolley on 2002-12-09 (www-html@w3.org from December 2002)

From: David Woolley <david@djwhome.demon.co.uk>
Date: Mon, 9 Dec 2002 07:17:22 +0000 (GMT)
To: www-html@w3.org
Message-Id: <200212090717.gB97HM804298@djwhome.demon.co.uk>

> 
> Pardon? Since the Validator no longer performs well-formedness checking
> without also Validating, that would mean this is a bug in the Validator.

It looks like it is legal, in which case I would say there is a problem
with the non-normative compatibility part of the XHTML 1.0 specification,
in that it fails to say that the use of CDATA sections is incompatible
with HTML 4; this is only hinted at, in that people are advised to avoid
contexts where this would be forced for script and style by, for example,
using out-of-line code.

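However, here is an example of invalid XHTML 1.0 code that the W3C validator,
like any DTD-driven one, accepts (slightly modified from an example
in the recommendation).  It nests one a element inside another, via em,
which the recommendation prohibits but an XML DTD cannot express: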

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Virtual Library</title>
  </head>
  <body>
    <p>Moved to <a href="http://example.org/">example.org
<em><a href="#nested"></a></em></a>.</p>
  </body>
</html>
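
A check for this kind of prohibition has to be done outside the DTD. The
following is only a minimal sketch in Python of how such a check might look;
the function name find_nested_anchors is made up for illustration, and it
covers only the "a inside a" case, not the other element prohibitions in
the recommendation. The XML declaration and DOCTYPE are left out of the
sample to keep it short.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
A_TAG = "{%s}a" % XHTML_NS  # ElementTree spells namespaced tags as {uri}local


def find_nested_anchors(root):
    """Return every <a> element that has another <a> as an ancestor.

    Hypothetical helper, not part of any validator: it walks the tree
    and remembers whether it is currently inside an <a>.
    """
    nested = []

    def walk(elem, inside_anchor):
        is_anchor = elem.tag == A_TAG
        if is_anchor and inside_anchor:
            nested.append(elem)
        for child in elem:
            walk(child, inside_anchor or is_anchor)

    walk(root, False)
    return nested


# Body of the example above (well-formed, and accepted by the DTD).
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Virtual Library</title>
  </head>
  <body>
    <p>Moved to <a href="http://example.org/">example.org
<em><a href="#nested"></a></em></a>.</p>
  </body>
</html>"""

offenders = find_nested_anchors(ET.fromstring(SAMPLE))
print([a.get("href") for a in offenders])
```

The inner anchor is found even though it is not a direct child of the
outer one, which is exactly the depth-independent exclusion that a content
model in an XML DTD has no way to state.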


> >You can't of course, represent a CDATA section in this way,
> 
> YM a CDATA section inside a CDATA section?

They are explicitly not allowed to nest; the first terminator will
terminate the whole nest.  (Didn't actually decipher "YM".)
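
A small Python sketch of the non-nesting behaviour, assuming the standard
library's expat-based ElementTree parser. The wrap_cdata helper is a
hypothetical name; it shows the usual workaround of splitting any "]]>"
in the data across two adjacent CDATA sections.

```python
import xml.etree.ElementTree as ET

# An attempt at "nesting": the inner "<![CDATA[" is just literal text,
# and the first "]]>" closes the one and only CDATA section.
doc = "<r><![CDATA[a <![CDATA[b]]> c</r>"
text = ET.fromstring(doc).text
print(repr(text))


def wrap_cdata(data):
    """Wrap data in CDATA, splitting any "]]>" so it cannot end the section.

    Hypothetical helper for illustration: "]]>" becomes "]]" at the end of
    one section and ">" at the start of the next.
    """
    return "<![CDATA[" + data.replace("]]>", "]]]]><![CDATA[>") + "]]>"


original = "x ]]> y"
round_tripped = ET.fromstring("<r>" + wrap_cdata(original) + "</r>").text
print(repr(round_tripped))
```

In the first case the parser hands back the inner "<![CDATA[" as ordinary
characters, followed by the text that came after the terminator.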

> >and you will still have gone through a translation from transfer
> >character set to UCS-4 and from UCS-4 to the display font encoding.
> 
> The relevance being?

That the XHTML processor is not dealing with the original source document,
but with a transformed version; I feel that the original was taking the
view that XMP was a complete literal pass-through, which it probably
was in the browsers that originally implemented it, as they would almost
certainly have had no character set processing.
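
To illustrate the point about transformation, here is a rough Python sketch.
The sample string and the choice of ISO-8859-1 and UTF-8 as transfer
encodings are assumptions made for the example; UTF-32 stands in for UCS-4.

```python
# The same characters as sent over the wire in two transfer encodings.
text = "caf\u00e9 <xmp>"
latin1_bytes = text.encode("iso-8859-1")
utf8_bytes = text.encode("utf-8")

# The byte streams differ ...
bytes_differ = latin1_bytes != utf8_bytes

# ... but after decoding, the processor sees identical code points and
# can no longer tell which byte sequence the source document contained.
decoded_a = latin1_bytes.decode("iso-8859-1")
decoded_b = utf8_bytes.decode("utf-8")
same_code_points = decoded_a == decoded_b

# Internal form: four bytes per character (UTF-32 here, as a UCS-4 stand-in).
ucs4 = decoded_a.encode("utf-32-be")

print(bytes_differ, same_code_points, len(ucs4) // len(decoded_a))
```

Whatever is done with the content after this step operates on the decoded
characters, so it is not a literal pass-through of the original bytes.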

Received on Monday, 9 December 2002 02:53:01 UTC