Re: badly formed end tags

From: Richard A. O'Keefe <ok@atlas.otago.ac.nz>
Date: Fri, 11 May 2001 14:28:11 +1200 (NZST)
Message-Id: <200105110228.OAA285940@atlas.otago.ac.nz>
To: Adrian.Lester@openwave.com, html-tidy@w3.org
	Would it be a major piece of work to get tidy to recognise whitespace within
	tags?  Is that valid HTML that Tidy should be trying to cope with, or is it
	badly written HTML?
	
Badly written HTML.

End-tag ::= '</' name white-space* '>'

Start-tag ::= '<' name (white-space+ attribute-name white-space*
                        ('=' white-space* attribute-value)?)*
                        white-space* '>'

White space characters, including line breaks, are allowed before the
closing '>', but NOT after the opening '<' or '</'.
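
As a rough illustration, the two productions above can be transcribed
almost literally into Python regular expressions.  This is only a sketch:
the NAME, WS and VALUE patterns are simplified assumptions made for the
example, not the real SGML or HTML definitions, and the helper names
is_start_tag and is_end_tag are hypothetical.

```python
import re

# Simplified token patterns, assumed for illustration only.
NAME = r"[A-Za-z][A-Za-z0-9.-]*"
WS = r"[ \t\r\n]"
VALUE = r'''(?:"[^"]*"|'[^']*'|[^\s"'>]+)'''

# End-tag ::= '</' name white-space* '>'
END_TAG = re.compile(rf"</{NAME}{WS}*>")

# Start-tag ::= '<' name (white-space+ attribute-name white-space*
#                         ('=' white-space* attribute-value)?)*
#                         white-space* '>'
START_TAG = re.compile(
    rf"<{NAME}(?:{WS}+{NAME}{WS}*(?:={WS}*{VALUE})?)*{WS}*>"
)


def is_start_tag(text):
    """True if the whole string matches the Start-tag production."""
    return START_TAG.fullmatch(text) is not None


def is_end_tag(text):
    """True if the whole string matches the End-tag production."""
    return END_TAG.fullmatch(text) is not None
```

With these patterns "</em >" and "<a href = 'x' >" are accepted, while
"</ em>" and "< em >" are rejected, because nothing in either production
allows white space between the opening delimiter and the name.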

Strictly speaking, in SGML, if you have a '</' that is not followed by
a letter, it is character data.  If you have a '<' that is not followed
by a letter, slash, bang, or question mark, that's data.
So
    Here < I > stand and < em > can do other </ i >.
is all data with not a tag in sight.
(This isn't the full truth, but it's right for the SGML features that
HTML selects.)
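
The rule just described can be sketched as a small Python function that
reports which '<' characters would open markup and which are plain data.
The function name markup_starts is hypothetical, and the check is exactly
as coarse as the rule stated above (it does not look any further into
what follows '<!' or '<?').

```python
def is_ascii_letter(ch):
    """True for a single ASCII letter; False for anything else, including ''."""
    return len(ch) == 1 and ch.isascii() and ch.isalpha()


def markup_starts(text):
    """Return the index of every '<' that opens markup under the rule above.

    Any '<' whose index is not returned is ordinary character data.
    """
    starts = []
    for i, ch in enumerate(text):
        if ch != "<":
            continue
        following = text[i + 1 : i + 2]
        if following == "/":
            # '</' is an end-tag open only when a letter comes next.
            if is_ascii_letter(text[i + 2 : i + 3]):
                starts.append(i)
        elif is_ascii_letter(following) or following in ("!", "?"):
            # '<' + letter, '<!' and '<?' are recognised as markup.
            starts.append(i)
    return starts
```

For the example sentence above, markup_starts returns an empty list; for
"<em>x</em>" it returns [0, 5].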

In XHTML, a '<' that is not followed by a letter, slash, bang, or question
mark, or a '</' that is not followed by a letter, is flat-out illegal.
Received on Thursday, 10 May 2001 22:28:35 GMT
