W3C home > Mailing lists > Public > html-tidy@w3.org > January to March 2002

Re: Tags lacking a terminating '>' are spotted

From: Richard A. O'Keefe <ok@atlas.otago.ac.nz>
Date: Thu, 14 Feb 2002 12:19:33 +1300 (NZDT)
Message-Id: <200202132319.MAA396731@atlas.otago.ac.nz>
To: html-tidy@w3.org
Dave Raggett <dsr@w3.org> wrote:
	Thanks for jogging my memory. The HTML working group wanted to
	support the widespread practice of omitting quote marks around
	attribute values when these didn't include spaces. This forced
	SHORTTAG YES even though none of the browsers supported the other
	features implied by SHORTTAG YES.
	
By Jove, he's right:
    "An attribute value specification can be an attribute value (that is,
    not an attribute value literal) only if it contains nothing but name
    characters and either:
    a) it occurs in an attribute definition list; or
    b) "SHORTTAG YES" is specified on the SGML declaration."

Why on _earth_ would the SGML designers make this entirely harmless feature
depend on SHORTTAG?  As far as I can see, the only benefit is to make SGML
parsers slower (because they'd have to make an otherwise unnecessary check
of SHORTTAG) and to link this feature with an extremely dangerous one (<>).

However, the fact that they _are_ linked means that people who develop
HTML pages using an SGML parser to check them for validity (like me) still
need to use HTML Tidy to check that <>, </>, <x//, <x<y> and so on are not
used, because an SGML parser won't complain about them.  Note in particular
that Emacs' is a good HTML editor, but uses an SGML parser, so it's quite
easy to slip in the odd <em/item/ that should be tidied.

This suggests that it would be a good thing for Tidy to recognise and
eliminate at least some of the SHORTTAG features.
Received on Wednesday, 13 February 2002 18:19:36 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 3 April 2012 06:13:51 GMT