[Bug 3626] XHTML detection relies only on namespace declaration

http://www.w3.org/Bugs/Public/show_bug.cgi?id=3626

           Summary: XHTML detection relies only on namespace declaration
           Product: CSSValidator
           Version: CSS Validator
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XHTML1.0
        AssignedTo: ot@w3.org
        ReportedBy: Christoph@Schneegans.de
         QAContact: www-validator-cvs@w3.org


In order to extract CSS rules and declarations from HTML/XHTML documents, the 
CSS Validator needs to parse these documents. Therefore, it needs to determine 
whether a document is HTML or XHTML.

The method currently in use is not very sophisticated: the CSS Validator only
looks for an XHTML namespace declaration. In particular, it ignores XML
declarations and XHTML document type declarations. However, the presence of
either declaration is a very reliable indicator of XHTML, so the CSS Validator
could safely parse any document that contains one as XHTML.
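
A minimal sketch of such a check, in Python rather than the validator's own
code. The function names and regular expressions below are assumptions made
for illustration and are not taken from the CSS Validator; they only show how
all three indicators could be consulted instead of the namespace alone.

```python
import re

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical patterns, not taken from the CSS Validator's source.
# An XML declaration is only allowed at the very start of the document
# (optionally after a byte order mark).
XML_DECL = re.compile(r"^\ufeff?<\?xml\s")
# Any XHTML public identifier, e.g. "-//W3C//DTD XHTML 1.0 Strict//EN".
XHTML_DOCTYPE = re.compile(r"<!DOCTYPE\s+html\s+PUBLIC\s+[\"']-//W3C//DTD XHTML ")
# The indicator the report says is currently the only one checked.
XMLNS_DECL = re.compile(r"xmlns\s*=\s*[\"']" + re.escape(XHTML_NS) + r"[\"']")


def xhtml_indicators(text):
    """Return the names of the XHTML indicators found in the document text."""
    found = []
    if XML_DECL.search(text):
        found.append("xml-declaration")
    if XHTML_DOCTYPE.search(text):
        found.append("xhtml-doctype")
    if XMLNS_DECL.search(text):
        found.append("xhtml-namespace")
    return found


def looks_like_xhtml(text):
    """Treat the document as XHTML if any single indicator is present."""
    return bool(xhtml_indicators(text))
```

With this sketch, a document that carries an XML declaration and an XHTML
document type declaration but no namespace declaration would still be
classified as XHTML.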

For example, the check of
<http://jigsaw.w3.org/css-validator/validator?uri=http://schneegans.de/temp/no-xmlns.html>
evidently does not use an XML parser; otherwise, the well-formedness violation
in that document would be detected.

Furthermore, the CSS Validator fails to detect a namespace declaration when it
is preceded by too many characters. Again, the check of
<http://jigsaw.w3.org/css-validator/validator?uri=http://schneegans.de/temp/late-xmlns.html>
does not use an XML parser.
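
To illustrate what an XML parser would contribute in both cases, here is a
small sketch using Python's standard xml.etree.ElementTree module. The helper
name parse_as_xml is hypothetical, and this is not how the CSS Validator is
implemented. The parser reports well-formedness violations and finds the root
element's namespace no matter how much content precedes it.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def parse_as_xml(data):
    """Return (well_formed, root_namespace) for a document given as bytes.

    Hypothetical helper for illustration only.
    """
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        # Any well-formedness violation ends up here.
        return (False, None)
    tag = root.tag
    if tag.startswith("{"):
        # ElementTree writes qualified names as "{namespace}localname".
        return (True, tag[1:].split("}", 1)[0])
    return (True, None)


# A namespace declaration preceded by a long comment is still found.
late = (b'<?xml version="1.0"?>\n<!-- ' + b"x" * 5000 + b" -->\n"
        b'<html xmlns="http://www.w3.org/1999/xhtml">'
        b"<head><title>t</title></head><body><p>x</p></body></html>")
print(parse_as_xml(late))

# An unclosed element is reported as not well-formed.
print(parse_as_xml(b"<html><body><p>unclosed</body></html>"))
```

One caveat of this sketch: the underlying parser does not read the external
DTD, so documents that use named entities from the XHTML DTDs may need extra
handling.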

Parsing XHTML documents as HTML may cause spurious errors. For example,
<http://schneegans.de/temp/space-preserve-no-namespace.html> is valid XHTML and
conforms to the Appendix C guidelines, yet
<http://jigsaw.w3.org/css-validator/validator?uri=http://schneegans.de/temp/space-preserve-no-namespace.html>
complains about the "xml:space" attribute.
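
For comparison, a namespace-aware XML parser accepts this attribute without
any declaration, because the "xml" prefix is predefined. The following sketch
again uses Python's xml.etree.ElementTree; the helper name and the sample
markup are assumptions made for illustration, not the content of the test
document.

```python
import xml.etree.ElementTree as ET

# The namespace that the predefined "xml" prefix is always bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def xml_space_values(data):
    """Return the xml:space value of every element that carries one.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(data)
    key = "{" + XML_NS + "}space"
    return [el.attrib[key] for el in root.iter() if key in el.attrib]


# Hypothetical sample; no xmlns declaration is needed for the xml prefix.
sample = (b"<html><head><title>t</title></head>"
          b'<body><pre xml:space="preserve"> a </pre></body></html>')
print(xml_space_values(sample))
```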

Received on Friday, 25 August 2006 18:48:28 UTC