
Re: About Tidy Parser Grammar

From: Dave Raggett <dsr@w3.org>
Date: Fri, 24 Mar 2000 11:45:29 -0600
To: Shivani <ssud@Adobe.COM>
Cc: html-tidy@w3.org
Message-ID: <OF1E12B50B.286EE7CB-ON86256862.00436EA3@rfdinc.com>

On Tue, 4 Jan 2000, Shivani wrote:

> I have been following the tidy-parser list for some time now.
> I am interested in the BNF or an equivalent grammar for the Tidy
> parser. I could not find it on the w3-tidy site (maybe I missed
> it). Can anyone help with a link to it, or tell me where I
> can find it?

The formal grammar for HTML is defined by the HTML specifications;
see http://www.w3.org/MarkUp. Unfortunately, a great many documents
and document generation tools don't conform to these specifications.
Tidy is a kind of expert system: it knows lots of rules for fixing
up real-world HTML, and it uses a pretty printer to output the result.
The code uses tables of tags and attributes to invoke the
appropriate processing methods, but the problems it has to deal with
are too complex for the parser to be described in BNF.
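
To make the table-driven idea concrete, here is a minimal sketch in
Python. It is not Tidy's actual code (Tidy itself is written in C), and
every name in it (TAG_TABLE, parse_block, parse_inline, parse_empty,
tidy_fragment) is hypothetical. It assumes a tiny tag set and only two
repair rules: a new block element implicitly closes an open <p>, and a
stray end tag is discarded. Real-world repair needs many more rules of
this kind, which is why a BNF grammar does not capture the behaviour.

```python
import re

# Matches a start/end tag (groups 1 and 2) or a run of text (group 3).
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)")


def parse_block(name, stack, out):
    # Repair rule: a new block element implicitly closes an open <p>.
    if stack and stack[-1] == "p":
        stack.pop()
        out.append("</p>")
    stack.append(name)
    out.append(f"<{name}>")


def parse_inline(name, stack, out):
    stack.append(name)
    out.append(f"<{name}>")


def parse_empty(name, stack, out):
    # Empty elements such as <br> have no end tag, so they are never stacked.
    out.append(f"<{name}>")


# Hypothetical tag table: each known tag maps to its processing routine.
TAG_TABLE = {
    "p": parse_block,
    "ul": parse_block,
    "b": parse_inline,
    "i": parse_inline,
    "br": parse_empty,
}


def tidy_fragment(html):
    """Return a repaired copy of a small HTML fragment (attributes dropped)."""
    stack, out = [], []
    for match in TOKEN.finditer(html):
        closing, name, text = match.groups()
        if text is not None:
            out.append(text)
            continue
        name = name.lower()
        if name not in TAG_TABLE:
            continue  # unknown tag: dropped in this sketch
        if closing:
            if name not in stack:
                continue  # repair rule: discard a stray end tag
            # Close every element opened since the matching start tag.
            while stack:
                top = stack.pop()
                out.append(f"</{top}>")
                if top == name:
                    break
        else:
            TAG_TABLE[name](name, stack, out)
    # Close anything still open at the end of the input.
    while stack:
        out.append(f"</{stack.pop()}>")
    return "".join(out)


print(tidy_fragment("<p>one<p>two <b>bold</i>"))
```

The table keeps the per-tag behaviour in one place, so adding a tag
means adding one entry, while the repair rules themselves live in
ordinary code rather than in grammar productions.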

Regards,

-- Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
tel/fax: +44 122 578 3011 (or 2521) +44 385 320 444 (mobile)
World Wide Web Consortium (on assignment from HP Labs)
Received on Friday, 24 March 2000 13:14:39 GMT
