
Re: XML empty-element syntax in SGML HTML documents

From: Nick Kew <nick@webthing.com>
Date: Thu, 23 May 2002 18:33:56 +0100 (BST)
To: "Christopher R. Maden" <crism@maden.org>
cc: <www-validator@w3.org>
Message-ID: <20020523182734.F6343-100000@fenris.webthing.com>

On Wed, 22 May 2002, Christopher R. Maden wrote:

> I was a bit startled to find an HTML 4.01 Transitional document passing the
> validator using <img /> syntax.
>
> I finally figured out why - it is valid SGML (of course).  However, it
> definitely doesn't mean what the author thought: it means an img tag ('<img
> /') followed by a greater-than in character data.
>
> I don't expect the SGML parser to catch this, however, it might be a good
> idea for the validator to flag any use of NET in a non-XML document.

This is something that's come up quite frequently on this list.
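
As a rough illustration of the kind of flag being asked for, here is a
minimal Python sketch.  It is not how any existing validator works: the
name find_net_tags and the regular expression are assumptions made for
this example, and it is a purely textual scan rather than an SGML parse.

```python
import re

# Hypothetical helper, not part of any validator: a crude textual scan for
# start tags written with the XML-style "/>" ending.  It only sees tags that
# fit on one line and will be fooled by a ">" inside an attribute value.
NET_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^<>]*?/\s*>")


def find_net_tags(html_source):
    """Return (line_number, tag_name) pairs for start tags ending in '/>'."""
    hits = []
    for lineno, line in enumerate(html_source.splitlines(), start=1):
        for match in NET_TAG.finditer(line):
            hits.append((lineno, match.group(1).lower()))
    return hits


if __name__ == "__main__":
    sample = '<head>\n<link rel="stylesheet" href="s.css" />\n</head>'
    for lineno, name in find_net_tags(sample):
        print("line %d: <%s ... /> used in a non-XML document" % (lineno, name))
```

A check like this would only make sense when the document is being
treated as SGML HTML; in XHTML the same syntax is of course correct.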

> There's a post at <URL:
> http://lists.w3.org/Archives/Public/www-validator/2002Feb/0151.html > which
> shows awareness of the issue; however, it's inaccurate.  The <link />
> syntax is *not* legal, as it dumps a > in character data inside the head,
> where it's not allowed.

Actually it's worse than that.  The character data implicitly closes
the HEAD and opens the BODY.  That leads to *very* confusing error reports,
and is one of many reasons to prefer Strict over Legacy^H^HTransitional.
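
To make that concrete, here is a toy Python model of the reading
described above.  It is only a sketch: split_net_tag and where_head_ends
are names invented for this example, and nothing here is a real SGML
parser.  It just treats the final ">" of an XML-style tag as character
data and reports the first HEAD element whose stray ">" would end the
HEAD.

```python
def split_net_tag(tag):
    """Split an XML-style empty tag as the SGML reading does.

    '<img />' becomes the tag '<img /' followed by '>' as character data.
    """
    if not tag.endswith("/>"):
        raise ValueError("not an XML-style empty tag: %r" % tag)
    return tag[:-1], tag[-1]


def where_head_ends(head_tags):
    """Return the index of the first tag whose stray '>' ends the HEAD.

    head_tags is a list of tag strings as written in the source.  Returns
    None if no tag uses the '/>' ending.
    """
    for index, tag in enumerate(head_tags):
        if tag.endswith("/>"):
            # Character data is not allowed in HEAD, so from here on the
            # document is implicitly in BODY.
            return index
    return None


if __name__ == "__main__":
    print(split_net_tag("<img />"))
    head = [
        '<meta name="a" content="b">',
        '<link rel="stylesheet" href="s.css" />',
        '<meta name="c" content="d" />',
    ]
    print("HEAD implicitly ends after element", where_head_ends(head))
```

Everything after the reported element is then being checked as BODY
content, which is why the resulting error messages look so unrelated
to the actual mistake.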

> I realize we can't turn SHORTTAG off,

Yes we can - OpenSP supports it as a warning (-wunclosed on the
command line).  You get that from the recommended parse mode of
Page Valet, or with Warnings enabled in the WDG validator.

-- 
Nick Kew

Available for contract work - Programming, Unix, Networking, Markup, etc.
Received on Thursday, 23 May 2002 14:11:45 GMT
