- From: <bugzilla@wiggum.w3.org>
- Date: Thu, 22 Mar 2007 09:12:11 +0000
- To: www-validator-cvs@w3.org
- CC:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=978

ot@w3.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |ASSIGNED
          Component|Parser                      |check
            Summary|errors in XMLPI make openSP |systematic xml preparse mode
                   |output errors beyond        |triggers wrong parse mode
                   |document boundaries         |for xml documents with
                   |                            |broken xml declaration

------- Comment #5 from ot@w3.org  2007-03-22 09:12 -------
http://qa-dev.w3.org/wmvs/HEAD/check?uri=http%3A%2F%2Fqa-dev.w3.org%2Fwmvs%2FHEAD%2Fdev%2Ftests%2Fbogus-xmlpi.html;debug
is useful in understanding what's happening.

* an XHTML document is sent as text/html (curse the day text/html was said to
  be OK for XHTML...)
* the parse mode is set to TBD
* preparse looks at document
  - by default HTML::Parser was set to XML mode
  - pre-parsing cannot find end of XML declaration, and thus parses the whole
    doc as if...
  - the doctype cannot be found
* as a result, XML mode is NOT triggered
* openSP is launched in SGML mode
* openSP parses the XML DTD as an SGML DTD, whines
* errors are reported in the DTD (which is why it looks as though it reports
  errors in the document, but at odd lines).

FIX: use pre-parser as XML mode only if the content-type has unambiguously
shown that we should do so. In the case of text/html, cautiously use SGML
pre-parsing. Finding an XHTML document type will later trigger xml mode in the
actual parser and validator.

[[
   my $p = HTML::Parser->new(api_version => 3);
-  $p->xml_mode(TRUE);
+  # if content-type has shown we should pre-parse with XML mode, use that
+  # otherwise (mostly text/html cases) use default mode
+  $p->xml_mode(TRUE) if ($File->{Mode} eq 'XML');
]]

I have to test this patch against a number of other test cases, but I'm hopeful
it should be the solution to this problem, as well as Bug #14.
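
For illustration only, the decision described in the FIX could be sketched as a small stand-alone function. This is not the validator's code: the name preparse_in_xml_mode and the list of media types treated as "unambiguously XML" are assumptions made for the example, and in the real patch the decision is simply read from $File->{Mode}, which is set elsewhere. The sketch uses core Perl only.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the validator): should the pre-parser run in
# XML mode, judging only from the Content-Type header value?
# ASSUMPTION: the media types below are an illustrative choice of what
# counts as "unambiguously XML"; the bug comment does not list them.
sub preparse_in_xml_mode {
    my ($content_type) = @_;
    return 0 unless defined $content_type;

    # Drop parameters such as "; charset=utf-8", then trim and lowercase.
    my ($type) = split /\s*;\s*/, $content_type;
    $type = '' unless defined $type;
    $type =~ s/^\s+|\s+$//g;
    $type = lc $type;

    return 1 if $type eq 'application/xhtml+xml';
    return 1 if $type eq 'application/xml' || $type eq 'text/xml';
    return 1 if $type =~ m{\+xml$};

    # text/html and anything else: stay cautious, pre-parse in SGML mode.
    # An XHTML doctype found later can still switch the real parser to XML.
    return 0;
}

for my $ct ('text/html; charset=utf-8', 'application/xhtml+xml') {
    printf "%-28s => %s pre-parse\n", $ct,
        preparse_in_xml_mode($ct) ? 'XML' : 'SGML';
}
```

With something like this in place, the patched line would amount to calling $p->xml_mode(1) only when the function returns true, so a text/html document with a broken XML declaration is never pre-parsed in XML mode.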
Received on Thursday, 22 March 2007 14:05:25 UTC