Re: Odd validator behaviour

Simon Wilkinson <sxw@dcs.ed.ac.uk> wrote:

> The following is a minimal HTML code snippet which exhibits odd
> behaviour in the W3 Validator. (The snippet is also available
> at http://www.dcs.ed.ac.uk/home/sxw/broken.html ) 
		(snip)
> The important line is the one beginning <H2>.
> 
> I believe this snippet to be incorrect - the <IMG tag
> never ends, and the </H2> gets gobbled - Netscape at least handles
> it like that.
> 
> However, given Netscape's ropy HTML parsing, and my limited knowledge
> of SGML - I'd believe that nsgmls gets it right - and that this is
> correct. But could someone explain to me why?

Take a look at:

	http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.3.7

In short, the SHORTTAG feature of SGML allows you to omit the closing
`>' of a tag when another tag follows immediately, as it does here.
But as you noticed, many existing HTML tools do not handle this
feature correctly, and the shorthand is not available in XML
(e.g. XHTML).
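
If you only want to spot this kind of thing in your own pages, a rough
Python sketch along the following lines may help. It is a heuristic of
my own, not a real SGML parser and not what nsgmls does: the function
name find_unclosed_tags and the sample line are made up for
illustration, and it simply reports any tag in which a new `<' turns up
outside a quoted attribute value before the closing `>'.

```python
def find_unclosed_tags(html):
    """Return (offset, fragment) for each tag interrupted by a new '<'.

    Heuristic only: it does not understand comments, CDATA or
    declarations, so treat the result as a hint, not a verdict.
    """
    hits = []
    i, n = 0, len(html)
    while i < n:
        # Treat '<' followed by a letter or '/' as the start of a tag.
        if html[i] == "<" and i + 1 < n and (html[i + 1].isalpha() or html[i + 1] == "/"):
            start = i
            quote = None  # the quote character we are inside, if any
            i += 1
            while i < n:
                ch = html[i]
                if quote:
                    if ch == quote:
                        quote = None
                elif ch in "\"'":
                    quote = ch
                elif ch == ">":
                    i += 1  # tag closed normally
                    break
                elif ch == "<":
                    # A new tag began before this one was closed.
                    hits.append((start, html[start:i]))
                    break  # stay on '<' so the next tag is scanned too
                i += 1
        else:
            i += 1
    return hits


# Hypothetical sample line: the IMG tag has no closing '>'.
sample = '<H2><IMG SRC="logo.gif" ALT="Logo"</H2>'
for offset, fragment in find_unclosed_tags(sample):
    print("unclosed tag at offset %d: %s" % (offset, fragment))
```

Anything it reports is markup that SGML may allow but that browsers
are likely to mishandle, so adding the missing `>' is the safe fix.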

If you want to check or correct such dubious markup, you are better
off using a specialized HTML tool, such as the HTML Tidy utility,
together with SGML validation.  Tidy is available at:

	http://www.w3.org/People/Raggett/tidy/

Regards,
-- 
Masayasu Ishikawa / mimasa@w3.org
W3C - World Wide Web Consortium

Received on Monday, 26 April 1999 00:28:17 UTC