- From: Terje Bless <link@pobox.com>
- Date: Tue, 12 Nov 2002 00:24:30 +0100
- To: HTML Editor <www-html-editor@w3.org>
- Cc: W3C HTML Mailinglist <www-html@w3.org>, W3C Validator <www-validator@w3.org>
Messrs.,

In attempting to modify the W3C MarkUp Validator to more reliably detect and report more forms of erroneous and invalid HTML, it has come to my attention that the SGML Declaration included with the HTML 4.01 Recommendation appears to be at odds with both the prose of that Recommendation and the majority of User Agent implementations.

The specific area of concern is in the FEATURES section; specifically the SHORTTAG feature in the MINIMIZE section. The current SGML Declaration reads in part:

    ....
    FEATURES
      MINIMIZE
        SHORTTAG YES
    ....

This allows many things that are not sanctioned by the prose of the HTML 4.01 Recommendation, are not implemented by any User Agent I am aware of, and appear to be contrary to the intent of the design of HTML (though this is obviously mere guesswork).

For instance, "SHORTTAG YES" allows empty start ("<>") and end ("</>") tags, unclosed start ("<gi") and end ("</gi") tags, and NET-enabling tags ("<gi/CDATA/").

I assume that this is due to the publishing schedule and implementation rate of the so-called "WebSGML Adaptations Annex" (Annex K) to the ISO SGML Standard in relation to the design and publishing of the HTML Recommendation.

May I suggest you issue an erratum for the HTML 4.01 Recommendation noting that the included SGML Declaration is there for compatibility with common SGML systems (that no longer exist today) and that the more precise SGML Declaration would contain a FEATURES section such as:

    ....
    FEATURES
      MINIMIZE
        SHORTTAG
          STARTTAG
            EMPTY    NO  -- outlaws "<>" --
            UNCLOSED NO  -- outlaws "<foo" --
            NETENABL NO  -- outlaws "<p/text<em/more text/ nested/" --
          ENDTAG
            EMPTY    NO  -- outlaws "</>" --
            UNCLOSED NO  -- outlaws "</foo" --
          ATTRIB
            DEFAULT  YES -- allows defaulted attributes --
            OMITNAME YES -- allows "<gi attr>" --
            VALUE    YES -- allows unquoted attrs; "<gi att=val>" --
    ....
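To make the constructs listed above concrete, here is a rough Python sketch that flags them in a string of markup. It is a regex heuristic, not an SGML parser, and it is not how the Validator works: the pattern table and the function name find_shorttag_constructs are hypothetical, chosen only for illustration, and comments, CDATA content, and quoted attribute values are not handled.

```python
import re

# Heuristic patterns for the SHORTTAG constructs discussed above.
# Assumption: a generic identifier is a letter followed by letters,
# digits, "." or "-". Quoted attribute values are NOT understood.
NAME = r"[A-Za-z][A-Za-z0-9.-]*"
PATTERNS = {
    "empty start tag": re.compile(r"<>"),
    "empty end tag": re.compile(r"</>"),
    # A start tag that runs into the next "<" without ever seeing ">".
    "unclosed start tag": re.compile(rf"<{NAME}[^<>/]*(?=<)"),
    # An end tag that runs into the next "<" without ever seeing ">".
    "unclosed end tag": re.compile(rf"</{NAME}\s*(?=<)"),
    # A start tag terminated by "/" (NET) instead of ">".
    "NET-enabling start tag": re.compile(rf"<{NAME}\s*/"),
}


def find_shorttag_constructs(markup):
    """Return (label, matched text) pairs ordered by position in markup."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(markup):
            hits.append((match.start(), label, match.group(0)))
    return [(label, text) for _, label, text in sorted(hits)]


if __name__ == "__main__":
    for label, text in find_shorttag_constructs("<p/text<em/more text/ nested/"):
        print(f"{label}: {text!r}")
```

Note that under this heuristic an XHTML-style "<br />" is reported as a NET-enabling start tag, which matches how an SGML parser with "SHORTTAG YES" reads it.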
Other parts of the SGML Declaration might also benefit from a review in light of intent, implementations, and current practice in SGML; but these are the issues we register that authors struggle with most often (sufficiently so that several of these have become "FAQs" for the Validator Service).

These are also issues that we _cannot_ detect and inform authors of, unless the SGML Declaration (or an erratum to same) makes use of this more fine-grained form from Annex K.

Kind Regards,
Terje Bless
Received on Monday, 11 November 2002 18:33:29 UTC