XML empty-element syntax in SGML HTML documents

I was a bit startled to find an HTML 4.01 Transitional document passing the 
validator using <img /> syntax.

I finally figured out why: it is valid SGML (of course), since HTML's SGML 
declaration turns SHORTTAG on.  However, it definitely doesn't mean what the 
author thought: it means a NET-enabling img start-tag ('<img /') followed by 
a greater-than sign in character data.

I don't expect the SGML parser to catch this; however, it might be a good 
idea for the validator to flag any use of NET in a non-XML document.
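
A minimal sketch of what such a flagging pass might look like, assuming a 
regular-expression heuristic run over the raw source is good enough.  The 
names NET_STYLE_TAG and find_xml_style_empty_tags are hypothetical, not part 
of the validator, and this is not a real SGML parse: comments, CDATA 
sections, and a ">" inside a quoted attribute value are not handled.

```python
import re

# Heuristic only: a start tag whose last non-space character before ">"
# is "/".  Under SGML SHORTTAG rules that "/" ends the tag, so the ">"
# that follows becomes character data.
NET_STYLE_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b([^<>]*?)/\s*>")


def find_xml_style_empty_tags(html_text):
    """Return (line_number, matched_text) for each '<tag ... />' found."""
    hits = []
    for match in NET_STYLE_TAG.finditer(html_text):
        # Line numbers are 1-based, counted from newlines before the match.
        line_number = html_text.count("\n", 0, match.start()) + 1
        hits.append((line_number, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = '<head>\n<link rel="stylesheet" href="s.css" />\n</head>'
    for line_number, text in find_xml_style_empty_tags(sample):
        print(f"line {line_number}: NET-style tag in non-XML document: {text}")
```

A pass like this would only make sense for documents whose DOCTYPE is not an 
XHTML one, and its hits are better reported as warnings than as errors.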

There's a post at <URL: 
http://lists.w3.org/Archives/Public/www-validator/2002Feb/0151.html > which 
shows awareness of the issue; however, it's inaccurate.  The <link /> 
syntax is *not* legal, as it dumps a > in character data inside the head, 
where it's not allowed.

I realize we can't turn SHORTTAG off, but a post-validation analysis is 
probably warranted.

~Chris
-- 
Christopher R. Maden, Principal Consultant, crism consulting
DTDs/schemas - conversion - ebooks - publishing - Web - B2B - training
<URL: http://crism.maden.org/consulting/ >
PGP Fingerprint: BBA6 4085 DED0 E176 D6D4  5DFC AC52 F825 AFEC 58DA

Received on Thursday, 23 May 2002 02:20:02 UTC