Re: Unknown Parse Mode! in form submission

From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
Date: Tue, 05 Sep 2006 11:19:35 +1000
Message-ID: <44FCD0A7.7080204@lachy.id.au>
To: Frank Ellermann <nobody@xyzzy.claranet.de>
CC: www-validator@w3.org

Frank Ellermann wrote:
> Lachlan Hunt wrote:
>> The problem is that the validator doesn't use the XML
>> declaration as its trigger to use XML mode.
> Is that a bug ?  If a document says <?xml version=" (etc.)
> at the begin, maybe after one of the various signatures
> for UTF-8, BOCU-1, etc., then it's supposed to be XML, or
> isn't it ?

Well, technically, no. <?xml> is a valid SGML PI.  However, in practice, 
nobody uses PIs in HTML, and certainly not ones that look exactly like 
the XML declaration without intending them to be one.

So the validator could easily get away with using that as a trigger. 
But if it does so, it should inform the user that the presence of the 
XML declaration has triggered XML parsing mode.
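
As a rough illustration of that idea (not how the validator actually 
works), a check along these lines would do.  The function name, the 
notice text and the "sgml" fallback are all made up for the example, and 
it only handles ASCII-compatible input with an optional UTF-8 BOM:

```python
import re

UTF8_BOM = b"\xef\xbb\xbf"

# Matches the start of an XML declaration: <?xml version="...
# (hypothetical pattern for this sketch; ASCII-compatible encodings only)
XML_DECL = re.compile(rb'''<\?xml\s+version\s*=\s*["']''')


def choose_parse_mode(data: bytes):
    """Return (mode, notice) for a document given as raw bytes.

    mode is "xml" if the document opens with an XML declaration,
    otherwise "sgml".  notice is a message for the user when the
    declaration triggered XML mode, else None.
    """
    # Skip a leading UTF-8 byte order mark, if any.
    head = data[len(UTF8_BOM):] if data.startswith(UTF8_BOM) else data
    if XML_DECL.match(head):
        return "xml", "XML declaration found: using XML parsing mode."
    return "sgml", None
```

The point of returning the notice alongside the mode is that the user 
is never left guessing why XML mode was chosen.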

>> It uses the DOCTYPE and switches to XML mode for known XML
>> DOCTYPEs.  In this case, the WAPFORUM XHTML DOCTYPE is
>> unknown.
> Okay, then the fastest fix might be to use SYSTEM instead of

Why would that work?  Using SYSTEM instead of PUBLIC doesn't provide any 
further indication about whether the document is XML or SGML.

>>> Manipulating the content-type in your meta doesn't help
>> Of course not, the meta element is absolutely useless for
>> specifying anything but the character encoding, and then
>> only for HTML, never for XML.
> Apparently it works also for XHTML to some degree, the error
> message is wrong if encoding="US-ASCII" and charset="UTF-8"
> are different.

It shouldn't.  When XHTML is treated as XML, XML rules apply and 
character encoding is determined roughly like this:

1. Protocol (e.g. Content-Type HTTP header)
2. XML Declaration
3. Byte Order Mark (must be UTF-8 or UTF-16)

The meta element is not defined for use in XML.
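
To make the order above concrete, here is a minimal sketch.  It follows 
the rough order I listed rather than the full detection procedure in the 
XML spec; the function name and parameters are invented for the example, 
and the final UTF-8 fallback is XML's default when nothing else is 
specified.  Note that it never looks at a meta element:

```python
import re
from typing import Optional

UTF8_BOM = b"\xef\xbb\xbf"
UTF16_BOMS = (b"\xff\xfe", b"\xfe\xff")

# encoding="..." inside an XML declaration at the very start of the document
# (hypothetical pattern for this sketch; ASCII-compatible encodings only)
DECL_ENCODING = re.compile(
    rb'''<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']'''
)


def detect_xml_encoding(data: bytes, content_type: Optional[str] = None) -> str:
    """Guess the encoding of an XML document, in the order listed above."""
    # 1. Protocol: charset parameter of the Content-Type header.
    if content_type:
        m = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type, re.I)
        if m:
            return m.group(1).lower()

    # 2. XML declaration (skipping a UTF-8 BOM if one precedes it).
    body = data[len(UTF8_BOM):] if data.startswith(UTF8_BOM) else data
    m = DECL_ENCODING.match(body)
    if m:
        return m.group(1).decode("ascii").lower()

    # 3. Byte order mark.
    if data.startswith(UTF8_BOM):
        return "utf-8"
    if data.startswith(UTF16_BOMS):
        return "utf-16"

    # Nothing specified: XML defaults to UTF-8.
    return "utf-8"
```

So with encoding="US-ASCII" in the declaration and charset=UTF-8 in the 
HTTP header, the header wins; a charset in a meta element changes 
nothing either way.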

Lachlan Hunt
Received on Tuesday, 5 September 2006 01:20:21 UTC
