- From: Terje Bless <link@pobox.com>
- Date: Sat, 22 May 2004 21:49:24 +0200
- To: W3C Validator <www-validator@w3.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:

>On Sat, 22 May 2004, Terje Bless wrote:
>
>>If possible, we would appreciate it if you could resubmit your document
>>for processing and let us know whether this has resolved the issue you
>>reported.
>
>The issue was resolved for the test case related to adding attributes
>into the Frameset DTD. The test case now validates. Thank you.
>But for http://www.cs.tut.fi/~jkorpela/html/nobr.html there's still a
>problem, though a different one. Apparently the validator now recognizes
>the customized DTD, but it just reports that the page is not valid,
>followed by "Below are the results of attempting to parse this document
>with an SGML parser." followed by no results. (The page _is_ valid.)

Well, whether the page is considered Valid probably depends on how you
choose to apply the term in this context (see below).

But either way, it is highly unfortunate behaviour for the Validator to
give the "Invalid" result page without actually listing any errors.
Thanks for reporting this!

>I remember having seen such a situation earlier, but I think the problem
>was fixed, and now it seems to have re-emerged.

That may have been with the WDG HTML Validator (see below).

>I think there used to the problem that if I add new elements into the
>definition of %phrase (as I do, by introducing NOBR), some internal
>limit (GRPCNT, I think) in the validator prevented validation -
>presumably it was unable to process the DTD. And I vaguely remember this
>caused a situation like the above. But when I now tried removing
>ACRONYM, thereby making the number of elements the same as in HTML 4.01
>DTD, it did not help. When I removed both ACRONYM and DFN, validation
>was successful. I'm puzzled.

This is in fact exactly what is going on.
Why removing ACRONYM didn't work isn't yet clear to me (possibly the
exceeded limit is in a different place than one might initially expect,
making the different elements unequal in this regard), but the error
triggered by your modified DTD is that it exceeds the GRPCNT limit.

The Validator isn't reporting this because we suppress errors located in
external entities (e.g. the External Subset)[0][1]. This is arguably the
correct behaviour, as that value is set in the SGML Declaration and HTML
4.01 has an (unfortunately rather implicit) fixed SGML Declaration. In
theory you can override this with an SGMLDECL Declaration (from the
WebSGML Annex to SGML), but I wouldn't recommend it and I suspect the
Validator would not handle this very well.

The WDG HTML Validator (and, probably, Page Valet), IIRC, uses a modified
SGML Declaration that extends these limits somewhat; and I think Liam
Quinn once sent us the values he's used. Unfortunately I didn't have time
at that point to really investigate it, and I was undecided on whether
using a modified SGML Declaration was the correct thing to do.

The W3C Markup Validator has always used the SGML Declaration from the
HTML 4.01 Recommendation[2], so it's unlikely that document has ever
passed there.

I'll look into the details of this issue, but as mentioned I'm uncertain
as to the correct course of action here. I would appreciate comments and
opinions on this and ways to address the issue.

[0] - Partially this is due to some spurious messages emitted by the SGML
      Parser for dubious (but not invalid) constructs in some W3C DTDs;
      and partially it's a conscious design decision related to the fact
      that the Validator is focussing on checking the part of documents
      authored by end users and not the DTDs. The latter because the
      majority of DTDs are assumed to have been written by people capable
      of manually checking the DTD with an SGML Parser directly.
[1] - There are three very useful debugging options that we use during
      development that will reveal issues such as this. They are
      ";debug=1" to enable debugging output, ";esis=1" to show the raw
      ESIS output from onsgmls, and ";errors=1" to show the raw error
      (stderr) output of onsgmls. By appending these options to the CGI
      URL you will get the associated option's output.

[2] - This has been virtually unchanged for all HTML Recommendations
      published by the W3C, and the Validator has always used the same
      one (modulo some Document Charset issues).

- -- 
"If at first you don't succeed, keep shooting." -- monk

-----BEGIN PGP SIGNATURE-----
Version: PGP SDK 3.0.3

iQA/AwUBQK+uwqPyPrIkdfXsEQK8WgCfYoPKAJh0hNY5u0mI3vm7+tFkSUUAoKZj
izJZlQ3nUWvrjAL+kHq6y4+B
=9b/x
-----END PGP SIGNATURE-----
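
For readers who want to try the debugging options described in footnote
[1], the following is a minimal sketch of how such a URL might be put
together. Only the ";debug=1", ";esis=1" and ";errors=1" options come
from the message itself; the endpoint http://validator.w3.org/check, the
"uri" parameter name, and the helper name validator_debug_url are
assumptions made for illustration and may not match the deployed service.

```python
from urllib.parse import quote

# Assumed CGI endpoint; the message only says "the CGI URL".
VALIDATOR_CGI = "http://validator.w3.org/check"


def validator_debug_url(page_url, debug=True, esis=True, errors=True,
                        cgi=VALIDATOR_CGI):
    """Build a Validator URL with the debugging options appended.

    The "uri" parameter name is an assumption; the ";"-separated
    option syntax follows the footnote.
    """
    parts = ["uri=" + quote(page_url, safe="")]
    if debug:
        parts.append("debug=1")   # enable debugging output
    if esis:
        parts.append("esis=1")    # raw ESIS output from onsgmls
    if errors:
        parts.append("errors=1")  # raw stderr output from onsgmls
    return cgi + "?" + ";".join(parts)


if __name__ == "__main__":
    print(validator_debug_url("http://www.cs.tut.fi/~jkorpela/html/nobr.html"))
```

With errors=1 enabled, a GRPCNT error raised while parsing the External
Subset would presumably appear in the raw onsgmls output even though the
normal result page suppresses it.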
Received on Saturday, 22 May 2004 15:49:28 UTC