Re: Understanding Tidy

From: J. David Bryan <jdbryan@acm.org>
Date: Sat, 9 Dec 2000 01:01:03 -0500
Message-Id: <200012090601.BAA11339@mail.bcpl.net>
To: HTML Tidy List <html-tidy@w3.org>

On 8 Dec 2000, at 12:06, Howard Kaikow wrote:

> I find it confusing when Tidy tells me I cannot use (leaving out details):
> 
> [...]
> 
> Tidy wants to see a </FONT> before the <UL>.

It's important to understand that Tidy is designed to take erroneous HTML 
and produce standards-conformant HTML by using its best guess as to the 
author's intent.  When it says you "cannot use" something, it means that 
you've submitted erroneous HTML, and Tidy will attempt to produce 
conformant HTML from your submission.  The result may or may not exactly 
reflect your intention, as Tidy's heuristics aren't infallible (yet :-).

Tidy is not designed to produce erroneous HTML on output, regardless of 
whether certain browsers accept that erroneous HTML and display what an 
author intended.


> Is SGML/HTML really that restrictive?

It is "restrictive" in the sense that you must follow the rules laid out in
the standard if you want to write valid HTML.  If you wish to write invalid
HTML that may work only in a given browser, then Tidy simply isn't the
right tool for the job.


> In order to understand Tidy, it would help to learn where is the formal
> specification on how tags can be nested.

The formal specifications are the DTDs (document type definitions)
associated with a given version of HTML.  They are the "final word"
regarding what's legal and what is not.  The HTML 4 DTDs (Strict,
Transitional, and Frameset) are listed in sections 21-23 of the HTML 4.01
Specification on the W3C site.

However, there are some prose descriptions of the legality rules available 
on the Web.  For example, you might refer to the Web Design Group's "HTML 
4.0 Reference" at:

  http://www.htmlhelp.com/reference/html40/

Looking at your original problem, if you click on the "Alphabetical list of 
HTML elements" link, and then click on the "FONT" link, you'll find that 
the description for this element shows:

  Contents:      Inline elements
  Contained in:  Inline elements, block-level elements except PRE 

If you then click on "Inline elements", you'll get a list of all elements
that may be contained within a FONT element.  UL isn't one of them: it is a
block-level element, which is why Tidy wants the FONT closed before the UL
begins.

This particular reference lists, for each element, what it may contain and 
what may contain it.  That may help you with your understanding of Tidy's 
diagnostics.
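
To make the FONT rule concrete, here is a rough Python sketch of the kind
of containment check involved.  It is not how Tidy itself works.  The
INLINE set is a hand-picked subset of the HTML 4 inline elements, the class
and function names are made up for illustration, and the sketch assumes
every non-empty element has an explicit end tag.

```python
from html.parser import HTMLParser

# Assumption: a hand-picked subset of the HTML 4 inline elements.
# The DTD remains the authoritative list.
INLINE = {"a", "b", "i", "em", "strong", "code", "span", "font", "img", "br"}

# Elements with no end tag; they are never pushed on the stack.
EMPTY = {"br", "img", "hr", "input"}


class FontNestingChecker(HTMLParser):
    """Report any non-inline element that is a direct child of FONT."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # stack of currently open elements
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        parent = self.open_elements[-1] if self.open_elements else None
        if parent == "font" and tag not in INLINE:
            self.problems.append(f"<{tag}> is not allowed inside <font>")
        if tag not in EMPTY:
            self.open_elements.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open element, if there is one.
        if tag in self.open_elements:
            while self.open_elements.pop() != tag:
                pass


def check(fragment):
    """Return a list of FONT content-model problems found in fragment."""
    checker = FontNestingChecker()
    checker.feed(fragment)
    checker.close()
    return checker.problems
```

With a made-up fragment such as '<FONT SIZE="2"><UL><LI>item</LI></UL></FONT>',
check() reports the UL.  Moving the FONT inside the LI, as in
'<UL><LI><FONT SIZE="2">item</FONT></LI></UL>', is one rearrangement that
passes, though it may not be the one Tidy would choose.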

                                      -- Dave
Received on Saturday, 9 December 2000 01:01:14 GMT