- From: Marc Richards <contact_marcos@yahoo.es>
- Date: Mon, 11 Jul 2005 11:28:15 -0400
- To: www-validator@w3.org
Hi,

I have been doing some research about the need for CDATA sections in XHTML pages and I have run into a couple of areas that are a bit unclear, so I was hoping some of the validator folks might have some answers. I am already aware of certain W3C validator issues, such as warnings being accidentally suppressed[1] and the fact that the validator "has some limitations" with regard to XML. In general I have done all my testing[2] against the development version of the validator[3].

So here are my questions:

1) Should the validator be throwing an error instead of a warning whenever it encounters an ampersand or left angle bracket as data in a document served as application/xhtml+xml? That is, was there a conscious decision to only throw a warning, or is this simply one of the XML parser limitations?

If this *is* one of the XML limitations, then I think it would be helpful to compile a short list of common limitations and publish it on a W3C page in plain English. I have read the OpenSP page[4] a couple of times and I am still not sure whether failing to recognize "<" and "&" as invalid is a limitation of the parser; the language on that page is fairly technical. The validator could link directly to this W3C page, which would in turn link to the OpenSP page.

If this isn't a parser limitation, is there a bug number open for recognizing these two characters as errors? I would like to add myself to the CC list.

2) Why are you issuing a warning for the use of ampersands and left angle brackets in XHTML but not in HTML? If the warning is in fact saying "this may be valid in some contexts, but it is recommended to use &amp; or &lt;", then this is an SGML warning and should be shown for both HTML and XHTML served as text/html, ideally with an example like "R & D valid, R&D invalid". Is there a bug open for issuing the warning for HTML doctypes as well?

Here is my current understanding of the validity of various doctypes/content types, along with how I think the validator should react.
They should correspond to my validation test cases[2].

ampersand and left bracket inside the body as data.
---------------------------------------------------
HTML4          - warning: you are a few whitespaces away from an invalid page
XHTML1 as HTML - warning: you are a few whitespaces away from an invalid page
XHTML1 as XML  - error: this is XML, fool!

ampersand and left bracket inside the body as data. oops, no spaces.
--------------------------------------------------------------------
HTML4          - error: unrecognized entity and unrecognized tag
XHTML1 as HTML - error: unrecognized entity and unrecognized tag
XHTML1 as XML  - error: unrecognized entity and unrecognized tag

ampersand and left bracket inside the script tag as data.
---------------------------------------------------------
HTML4          - no problem: the script tag is CDATA in the HTML4 DTD
XHTML1 as HTML - warning: you are a few whitespaces away from an invalid page. the script tag is PCDATA in the XHTML1 DTD
XHTML1 as XML  - error: this is XML, fool!

ampersand and left bracket inside the script tag as data. oops, no spaces.
--------------------------------------------------------------------------
HTML4          - no problem: the script tag is CDATA in the HTML4 DTD
XHTML1 as HTML - error: unrecognized entity and unrecognized tag
XHTML1 as XML  - error: unrecognized entity and unrecognized tag

ampersand and left bracket inside the script tag as data w/ the CDATA tag
-------------------------------------------------------------------------
HTML4          - no problem: no harm in a redundant CDATA section
XHTML1 as HTML - no problem: CDATA section to the rescue
XHTML1 as XML  - no problem: CDATA section to the rescue

ampersand and left bracket inside the script tag as data w/ the CDATA tag. oops, no spaces.
------------------------------------------------------------------------
HTML4          - no problem: no harm in a redundant CDATA section
XHTML1 as HTML - no problem: CDATA section to the rescue
XHTML1 as XML  - no problem: CDATA section to the rescue

Are these scenarios correct/ideal? Are there open bugs you can point me to? Are there bugs I should file?

Thanks.
Marc

[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=798
[2] http://mulberry.swarthmore.edu/validation-tests/
[3] http://validator.w3.org:8001/
[4] http://openjade.sourceforge.net/doc/xml.htm
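
For the "XHTML1 as XML" rows above, the well-formedness side of the question can be checked independently of the validator. The following is a minimal sketch, assuming Python's standard-library XML parser as a stand-in for a conforming XML processor; it is not the validator's OpenSP parser. It performs no DTD validation, and the sample snippets and the names `is_well_formed`, `SAMPLES` and `RESULTS` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if `fragment` parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Hypothetical snippets mirroring the scenarios in the message
# (these are NOT the actual test cases referenced as [2]).
SAMPLES = {
    "body, bare, with spaces": "<p>R & D, a < b</p>",
    "body, bare, no spaces": "<p>R&D, a<b</p>",
    "body, escaped": "<p>R &amp; D, a &lt; b</p>",
    "script, bare": "<script>if (a < b && c) {}</script>",
    "script, CDATA": "<script><![CDATA[ if (a < b && c) {} ]]></script>",
}

# Well-formedness verdict for each snippet.
RESULTS = {name: is_well_formed(xml) for name, xml in SAMPLES.items()}

if __name__ == "__main__":
    for name, ok in RESULTS.items():
        print(f"{name}: {'well-formed' if ok else 'NOT well-formed'}")
```

Under this parser, only the escaped and CDATA-wrapped snippets are accepted; a bare "&" or "<" is rejected with or without surrounding whitespace, which is consistent with expecting an error rather than a warning for documents processed as XML.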
Received on Monday, 11 July 2005 15:28:19 UTC