- From: Frank Ellermann <nobody@xyzzy.claranet.de>
- Date: Mon, 28 Apr 2008 11:13:12 +0200
- To: www-validator@w3.org
olivier Thereaux wrote:

> the feedback requested was mostly on documents without
> declared character encoding

I can't help you there, I have no Latin-1 files, Andreas
posted a test URL.

> error is fixed. 5279 and 5280 are untouched.

FWIW I still think that these are bugs or oddities in the
XML 1.0 3rd and 4th editions, and not in the validator.

>> [warning] Unable to Determine Parse Mode!
>> [...]
>> | Type (-//IETF//DTD HTML i18n//EN) is not in the validator's catalog
>> <http://freenet-homepage.de/Xyzzy/home/test/res.html>
>> (SGML is correct - RFC 2070 DTD republished by IANA)

> It does validate right now. I guess you are pointing out that
> “Unable to Determine Parse Mode” could use a better wording
> in this case?

No, the wording is fine, but what happens is suboptimal:

* The DTD should be added to the catalog, then SGML is clear.
* At the moment the default parse mode is SGML, that happens
  to be okay for this document.

But when you say that UTF-8 is the future you can as well say
that SGML is the past:

* HTML5 will do its own thing, the rest of the world uses XML,
  and SGML is doomed, ignoring billions of HTML < 5 documents.

>> [warning] Missing "charset" attribute for "text/xml" document.
>> <http://freenet-homepage.de/Xyzzy/home/test/utf-4.xml>
>> (this text/xml document really uses encoding US-ASCII)

>| HEAD http://freenet-homepage.de/Xyzzy/home/test/utf-4.xml
>| Content-Type: text/xml

> So I guess the warning by the validator, that the spec
> specifies a strong default of "us-ascii" is OK here?

IMO it is odd, it warns about using a "strong default" ASCII,
the document in fact starts with <?xml encoding="us-ascii" ?>,
it turns out to be ASCII, and if it would use a single octet
above 0x7F it would cause a real error message. The warning
is apparently pointless, can the validator output an "info" ?
As "info: assuming ASCII", or "info: using SGML" (see above),
the case could be clearer.
I'm used to "a warning is always bad news", using compiler
option "pedantic". About a decade ago the *NIX style was
"no news is good news".

>> [warning] Mismatch between Public and System identifiers [...]
>> (the released validator has no problem with using System
>> identifiers pointing to its own catalog, maybe it's an
>> artefact of the qa-dev.w3.org != validator.w3.org setup)

> That's a new feature. Some recent feedback prompted the
> addition of a check for consistency between FPI and SI.
> It's a warning, so as usual, it can be ignored if you are
> sure of your doctype.

I'll never ever ignore warnings. They can cost months when
porting code from compiler A on platform B to compiler C on
platform D. A warning is a serious thing like a SHOULD in an
RFC (and an error is like a MUST meaning "you are dead").

When the WDG validator says "warning" it means "this will
break each and every browser, even if it could be in a very
formal and theoretical sense 'valid' (for Amaya, not ITW)".

>> [warning] Character Encoding mismatch!
>> | The character encoding specified in the HTTP header
>> | (iso-8859-1) is different from the value in the <meta>
>> | element (windows-1252). [...]
>> (Nikita consistently hates u+0080 based on an iso-8859-1
>> assumption, and the document uses a windows-1252 0x80 €)

> Mm, sorry, not sure if you are reporting an issue or a
> "work as it should, here". Can you give more details?

If the validator really assumes iso-8859-1, then 0x80 is
u+0080, not a valid SGML character, as reported by Nikita.
I miss the error message. As soon as this error message
shows up I could again say that the assumption is already
wrong, of course a windows-1252 0x80 is okay...

 Frank
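
For illustration only, the two octet-level points above (0x80 under
iso-8859-1 versus windows-1252, and "no octet above 0x7F" for US-ASCII)
can be sketched in a few lines of Python. This is not validator code;
the helper name is_us_ascii is made up for this example.

```python
# Illustration only, not taken from the validator's sources.

octet = b"\x80"

# Under iso-8859-1 the octet 0x80 maps to the C1 control U+0080.
as_latin1 = octet.decode("iso-8859-1")

# Under windows-1252 the same octet maps to the euro sign U+20AC.
as_cp1252 = octet.decode("windows-1252")

print(f"iso-8859-1:   U+{ord(as_latin1):04X}")
print(f"windows-1252: U+{ord(as_cp1252):04X}")


def is_us_ascii(data: bytes) -> bool:
    """Hypothetical helper: True if no octet is above 0x7F."""
    return all(b <= 0x7F for b in data)
```

So the same document is fine when read as windows-1252 and contains
u+0080 when read as iso-8859-1, which is exactly the disagreement
between the HTTP header and the <meta> element.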
Received on Monday, 28 April 2008 09:11:15 UTC