RE: tidy problems on www.altavista.com

From: Alexander Biron <biron@ifh.de>
Date: Thu, 22 Jun 2000 10:58:59 +0200 (METDST)
To: Ittay Freiman <ittay@vigiltech.com>
cc: "'html-tidy@w3.org'" <html-tidy@w3.org>
Message-ID: <Pine.HPX.4.21.0006221024410.3108-100000@hpbai2.ifh.de>
On Thu, 22 Jun 2000, Ittay Freiman wrote:

> you are, of course, right. however, i need to parse files as the regular
> browser parses them, that is, as the writer of the page intended on them to
> be parsed. 

First of all, these can be two different things.
Secondly, I understand your ambition, but fear tidy is not the optimal
tool for that task - it was designed for something else.

> more than that, i think tidy is wrong here. td is an inline tag,
> while form is a block, so <td> followed by <form> should be converted to
> <td></td><form>. the same thing goes to the /form.

FORM is block level, correct. TD, however, may contain %flow;, and %flow;
includes block-level elements. So TD may contain FORM. Please search
http://www.w3.org/TR/REC-html40/sgml/dtd.html to see this: 


<!ENTITY % block
                "P | %heading; | %list; | %preformatted; | DL | DIV |
	NOSCRIPT | BLOCKQUOTE | FORM | HR | TABLE | FIELDSET | ADDRESS">
i.e. <form> is block (as is table)

<!ELEMENT (TH|TD)  - O (%flow;)*       -- table header cell, table data
					cell-->
i.e. <td> may contain flow
<!ENTITY % flow "%block; | %inline;">
i.e. block is flow.

So tidy's result is syntactically correct here:
<td><form></td></form> -> <td><form></form></td>

A different question is whether a different correct syntax comes closer
to what the author intended:

<form><table></table></form>

To my understanding, this is a very common structure when using
forms.

<!ELEMENT FORM - - (%block;|SCRIPT)+ -(FORM) -- interactive form -->
i.e. <form> may contain block, so the above example is also legal.
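
The containment argument from the DTD fragments quoted above can also be
checked mechanically. What follows is only a minimal sketch: the content
models are copied by hand from the HTML 4.0 strict DTD, %inline; is reduced
to a few representative elements, and the names BLOCK, FLOW, CONTENT_MODEL
and may_contain are hypothetical helpers, not anything tidy provides.

```python
# Hand-copied subset of the HTML 4.0 strict DTD (illustration only).
HEADING = {"h1", "h2", "h3", "h4", "h5", "h6"}
LIST = {"ul", "ol"}
PREFORMATTED = {"pre"}

# <!ENTITY % block "P | %heading; | %list; | %preformatted; | DL | DIV |
#   NOSCRIPT | BLOCKQUOTE | FORM | HR | TABLE | FIELDSET | ADDRESS">
BLOCK = {"p", "dl", "div", "noscript", "blockquote", "form", "hr",
         "table", "fieldset", "address"} | HEADING | LIST | PREFORMATTED

# Assumption: only a small sample of %inline;, enough for this example.
INLINE_SAMPLE = {"a", "em", "strong", "input", "select", "textarea"}

# <!ENTITY % flow "%block; | %inline;">
FLOW = BLOCK | INLINE_SAMPLE

# Direct-child content models. The -(FORM) exclusion really applies at any
# depth; this sketch only models it for direct children.
CONTENT_MODEL = {
    "td": FLOW,
    "th": FLOW,
    "form": (BLOCK | {"script"}) - {"form"},
}


def may_contain(parent, child):
    """Return True if `child` is allowed directly inside `parent`.

    Parents missing from CONTENT_MODEL are treated as allowing nothing.
    """
    return child in CONTENT_MODEL.get(parent, set())


print(may_contain("td", "form"))     # <td><form>...</form></td>
print(may_contain("form", "table"))  # <form><table>...</table></form>
print(may_contain("form", "form"))   # nested forms are excluded
```

Both nestings discussed here pass the check, which is the point: the DTD
alone cannot tell tidy which of the two the author meant.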


So tidy has to choose which of the two legal syntaxes is the one the
author intended. I am not sure that the second one is _always_
the better one (in your case it certainly seems so). But maybe the
default should be 
<td><form></td></form> -> <form><table></table></form> instead of
<td><form></td></form> -> <td><form></form></td>
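
To make the first of these two rewrites concrete, here is a minimal sketch
of a stack-based repair: when an end tag arrives for an outer element, every
element still open inside it is closed first, and an end tag with no matching
open element is dropped. This is an assumption about how such a repair could
work, not tidy's actual algorithm; repair_nesting and TAG are hypothetical
names, and empty elements such as <input> or <br> are not handled.

```python
import re

# Matches a start or end tag; attributes are skipped, not parsed.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def repair_nesting(html):
    """Close inner elements before an outer end tag; drop stray end tags."""
    out = []
    stack = []  # names of currently open elements, innermost last
    pos = 0
    for m in TAG.finditer(html):
        out.append(html[pos:m.start()])  # text between tags, kept as-is
        pos = m.end()
        is_end = m.group(1) == "/"
        name = m.group(2).lower()
        if not is_end:
            stack.append(name)
            out.append(m.group(0))
        elif name in stack:
            # Close everything opened inside `name`, then `name` itself.
            while stack:
                top = stack.pop()
                out.append("</%s>" % top)
                if top == name:
                    break
        # Otherwise: end tag without a matching open element, so drop it.
    out.append(html[pos:])
    # Close whatever is still open at the end of the input.
    while stack:
        out.append("</%s>" % stack.pop())
    return "".join(out)


print(repair_nesting("<td><form></td></form>"))
```

The second rewrite, moving the form outside the table, cannot be produced by
a local rule like this one, since it needs to look at the enclosing table as
a whole.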


-- 
Cheers alex          Alexander Biron

Support the ban of Dihydrogen Monoxide: http://www.dhmo.org/

work:	http://www.ifh.de/~biron/	private:
	Tel (+49)33762-77-483   	Tel(+49)30-4948857
	mailto:biron@ifh.de    		mailto:biron@frohnau-flamingos.de
Received on Thursday, 22 June 2000 04:59:01 GMT
