- From: <bugzilla@wiggum.w3.org>
- Date: Thu, 22 Mar 2007 09:12:11 +0000
- To: www-validator-cvs@w3.org
- CC:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=978
ot@w3.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |ASSIGNED
          Component|Parser                      |check
            Summary|errors in XMLPI make openSP |systematic xml preparse mode
                   |output errors beyond        |triggers wrong parse mode
                   |document boundaries         |for xml documents with
                   |                            |broken xml declaration
------- Comment #5 from ot@w3.org 2007-03-22 09:12 -------
http://qa-dev.w3.org/wmvs/HEAD/check?uri=http%3A%2F%2Fqa-dev.w3.org%2Fwmvs%2FHEAD%2Fdev%2Ftests%2Fbogus-xmlpi.html;debug
is useful in understanding what's happening.
* an XHTML document is sent as text/html (curse the day text/html was said to
be OK for XHTML...)
* the parse mode is set to TBD
* the pre-parser looks at the document
  - by default, HTML::Parser was set to XML mode
  - pre-parsing cannot find the end of the XML declaration, and thus parses
    the whole doc as if...
  - the doctype cannot be found
* as a result, XML mode is NOT triggered
* openSP is launched in SGML mode
* openSP parses the XML DTD as an SGML DTD, whines
* errors are reported in the DTD (which is why it looks as though it reports
errors in the document, but at odd lines).
FIX: run the pre-parser in XML mode only if the content-type has unambiguously
shown that we should do so.
In the case of text/html, cautiously use SGML pre-parsing. Finding an XHTML
document type will later trigger XML mode in the actual parser and validator.
[[
my $p = HTML::Parser->new(api_version => 3);
- $p->xml_mode(TRUE);
+ # if content-type has shown we should pre-parse with XML mode, use that
+ # otherwise (mostly text/html cases) use default mode
+ $p->xml_mode(TRUE) if ($File->{Mode} eq 'XML');
]]
I have to test this patch against a number of other test cases, but I'm hopeful
it should be the solution to this problem, as well as Bug #14.
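
To make the decision in the fix concrete, here is a minimal standalone sketch
of the idea. It is not the code from check: the helper names
mode_for_content_type and preparse_in_xml_mode are hypothetical, and the list
of media types treated as unambiguously XML is an assumption made for
illustration. Only the $File->{Mode} field and its 'XML' / TBD values come
from the description above.

```perl
use strict;
use warnings;

# Assumption: media types that unambiguously mean "this is XML".
# The real validator may use a different list.
my %XML_TYPES = map { $_ => 1 } qw(
    application/xhtml+xml
    application/xml
    text/xml
    image/svg+xml
);

# Hypothetical helper: map a Content-Type header value to a provisional
# parse mode. text/html (and anything unknown) stays 'TBD' until the
# doctype has been found.
sub mode_for_content_type {
    my ($content_type) = @_;
    # Drop parameters such as "; charset=utf-8" and normalise case.
    my ($media_type) = split /;/, lc($content_type // ''), 2;
    $media_type = '' unless defined $media_type;
    $media_type =~ s/^\s+|\s+$//g;
    return 'XML' if $XML_TYPES{$media_type};
    return 'TBD';
}

# Hypothetical helper: pre-parse in XML mode only when the content type
# has already settled the question.
sub preparse_in_xml_mode {
    my ($File) = @_;
    return (defined $File->{Mode} && $File->{Mode} eq 'XML') ? 1 : 0;
}

for my $type ('text/html; charset=utf-8', 'application/xhtml+xml') {
    my $File = { Mode => mode_for_content_type($type) };
    printf "%-26s mode=%-3s xml pre-parse=%d\n",
        $type, $File->{Mode}, preparse_in_xml_mode($File);
}
```

In the validator itself, the flag returned by preparse_in_xml_mode would
correspond to the condition guarding $p->xml_mode(TRUE) in the patch, so a
text/html document with a broken XML declaration is pre-parsed in the default
mode and its doctype can still be found.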
Received on Thursday, 22 March 2007 14:05:25 UTC