- From: Terje Bless <link@pobox.com>
- Date: Fri, 29 Aug 2003 13:10:12 +0200
- To: W3C Validator <www-validator@w3.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:

>On Thu, 28 Aug 2003, Terje Bless wrote:
>>-----BEGIN PGP SIGNED MESSAGE-----
>
>Please don't. You gain nothing but confusion among some people.

Oh? I have some specific reasons for PGP signing my messages, but perhaps
if you elaborated a bit on what you mean I would have reason to reconsider?

>><http://validator.w3.org:8001/>
>
>It claims that my main page http://www.cs.tut.fi/~jkorpela/ is not
>valid, claiming that h1 is not allowed without a preceding <body>.
>
>It also claims that some end tags must not be omitted because "OMITTAG
>NO was specified".
>
>This is undescribably absurd. It ain't no 1st of April!

Well, as others have explained, you are seeing the effects of the new
«Fussy Parsing» mode. Given that I know you are familiar with the
concepts, I'll take it that your complaint is rather that it was
misleading in the test case you attempted; correct?

In response, let me summarize the points brought forward by Nick, Jim,
and Olivier:

* The new «Fussy Parsing» mode is an optional additional check. It is the
  beginning of adding back the equivalent of the «Weblint» option that
  used to be available, but which had to be removed because it was too
  badly out of touch with reality.

* That the «Fussy Parsing» is enabled quietly and by default when
  submitting a URL from the front page is a mistake (a bug). Note that
  this only affects URLs submitted through the form on the front page of
  the Beta; «/check?uri=referer» is not affected and on the «Advanced
  Form» the option is visible and can be disabled.

* At some suitable point during the beta test — most likely coinciding
  with other updates and bug fixes — the relevant checkbox will be made
  visible, but still enabled by default to garner feedback and wider
  testing of this new feature. By the time we go to final release, it is
  likely that it will be disabled by default and possibly relegated to
  only the «Advanced Form».
  This is exactly what beta testing is for: to determine the exact
  disposition of this type of change.

* The presentation of the results when «Fussy Parsing» is enabled appears
  not to be satisfactory. This will be rectified to the extent possible
  before release. If you have suggestions for how to improve it they
  would be most welcome.

And as a final note, the «Fussy Parsing» did *exactly* what it was
intended to do with your page! It told you that you had omitted the start
and end tags for the «body» element. I'm sure you were aware of this and
had quite possibly omitted them on purpose, but then you're hardly the
target of this new feature.

As the release notes said: it's not a parse mode for fussy people, it's a
parse mode that is fussy so that you don't have to be! It is specifically
intended to make the validator more useful for the general population of
users, but _without_ compromising its objectivity.

The «Fussy Parsing» mode is implemented by fiddling with the effective
SGML Declaration on the fly — as opposed to some regex or heuristic
hackery — to account for the fact that the original SGML Declaration is
badly out of touch with common implementations.

If anyone should care to dig deeper into this, the «Fussy Parsing» mode
is implemented by passing additional warning options to OpenSP, our SGML
Parser. The specific options in use are:

* unclosed — Warn about unclosed start and end-tags.
* empty — Warn about empty start and end-tags.
* net — Warn about net-enabling start-tags and null end-tags.
* refc — Warn about omitted refc delimiters.
* data-delim — Warn about occurrences of `<' and `&' as data.
* missing-att-name
  Warn about omitted attribute names in start tags.
* fully-tagged
  Warn if the document instance fails to be fully-tagged. This has the
  effect of changing the SGML declaration to specify DATATAG NO, RANK NO,
  OMITTAG NO, SHORTTAG STARTTAG EMPTY NO and SHORTTAG ATTRIB OMITNAME NO.

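For anyone who wants to experiment outside the validator, the option list
above can be turned into a parser command line roughly as follows. This
is only a sketch under assumptions: the executable name "onsgmls", the
use of "-s" to suppress normal output, and the names FUSSY_WARNINGS and
build_parser_args are illustrative and are not taken from the validator's
own code. The only thing it relies on is that each warning type is passed
to OpenSP as a "-w" option.

```python
# Hypothetical sketch; not the validator's actual invocation.
# Warning types named in this message for "Fussy Parsing" mode.
FUSSY_WARNINGS = [
    "unclosed",
    "empty",
    "net",
    "refc",
    "data-delim",
    "missing-att-name",
    "fully-tagged",
]


def build_parser_args(document_path, fussy=False, parser="onsgmls"):
    """Return an argv list for one parser run.

    The parser name is an assumption; adjust it to whatever your OpenSP
    installation calls its SGML parser binary.
    """
    args = [parser, "-s"]  # -s: suppress normal output, keep error messages
    if fussy:
        # Each warning type is enabled with its own -w<type> switch.
        args.extend("-w" + name for name in FUSSY_WARNINGS)
    args.append(document_path)
    return args


if __name__ == "__main__":
    # Show the command line that would be run for a fussy parse.
    print(" ".join(build_parser_args("index.html", fussy=True)))
```

The resulting list can be handed to subprocess.run() if OpenSP is
installed. Keeping the warning names in one list makes it easy to try a
subset of them, which is relevant to the question of more fine-grained
selection between the switches.
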
In addition, the following options are always enabled for SGML documents:

* valid
  Has the effect of changing the SGML declaration to specify VALIDITY
  TYPE and IMPLYDEF ATTLIST NO ELEMENT NO ENTITY NO NOTATION NO.
* non-sgml-char-ref
  Warn about numeric character references to non-SGML characters.
* no-duplicate
  Do not warn about duplicate entity declarations.

I would be happy to discuss the details of each of these options, when
and how to enable them, as well as any other options we should add
support for. I am also open to discussion of whether there should be more
fine-grained selection possible between all of the optional switches or
none of them.

And as already mentioned, the user interface to these options and their
presentation is still up for discussion.

>>[…] — «Who the heck *writes* this stuff?!?!» — […]
>
>Sorry, it seems that some virus inserted garbage text into your message.

Oh? Are you referring to the double quote marks and the em-dash? As far
as I know, those should be sent as properly encoded UTF-8. Please let me
know if they were not. Could it be that your email client does not
support UTF-8?

>>When the W3C Markup Validator is running in «Fussy Parsing» mode it
>>will complain about all sorts of things that are technically legal in
>>HTML, but which is known to be problematic in practice and probably not
>>what you wanted.
>
>It's not just "complaining". It's claiming that a valid document is
>invalid. And this is apparently intentional. Hence, it does not even try
>to be a validator any more.

That does not follow; neither from the argument nor from the observed
behaviour.

Please let us know what we can do to _improve_ the situation — and I
assure you, we care very deeply about making the validator as useful a
tool as it possibly can be! — instead of assuming we are deliberately
trying to sabotage you (well, or using that apparent assumption as a
rhetorical device in any case).

- -- 
Of course we are the good guys! We define what is good and evil.
All other definitions are wrong, and possibly the product of a
deranged imagination.
                                         -- Stephen Harris

-----BEGIN PGP SIGNATURE-----
Version: PGP SDK 3.0.2

iQA/AwUBP080k6PyPrIkdfXsEQKpjACeMSX1XVlKeU8yoivZNo4QptWrIbgAoKNb
2JHLdJiUpMBieHdngvcqj8Tk
=cpLY
-----END PGP SIGNATURE-----
Received on Friday, 29 August 2003 07:10:15 UTC