Possible problem and solution: if FONT is the only child of a parent <td> element...

======================================================
I have previously sent this message to the list.  
I'm re-sending it with sample input and output
files attached.  Sorry for the duplicate messages.
======================================================

Hi-

I've been enthusiastically using your HTML Tidy for a few
weeks now, and I think it's a great utility.  It's really
helped me to find lots of mistakes in my HTML files.

I have noticed the following possible problem, though:
if a <FONT> tag is the only child of a <TD> tag, the font
specification gets dropped.
 
Consider the following sample HTML file with a single table,
containing a single row, with a single cell, with a
specified font (file attached as test.htm):

 
=test.htm=============================================
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<title></title>
</head>
<body>
<table>
<tr>
<td><font face="arial" size=5>first font</font></td>
</tr>
</table>
</body>
</html>
======================================================

 
After running it through Tidy with the following config file
(file attached as config.txt):


=config.txt===========================================
// config file for HTML tidy
markup: yes
clean: yes
show-warnings: yes
======================================================

 
I get the following (incorrect?) output
(file attached as testout1.htm):

 
=testout1.htm=========================================
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<title></title>
</head>
<body>
<table>
<tr>
<td>first font</td>
</tr>
</table>
</body>
</html>
======================================================

 
Note that my font specification for the table cell
has been removed.  Ideally, I think something like the
following should have been generated (note that the
font specification has been converted to a class and assigned
to the parent <td> element)
(file attached as testout2.htm):

 
=testout2.htm=========================================
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<title></title>
<style type="text/css">
 td.c1 {font-family: arial; font-size: 150%}
</style>
</head>
<body>
<table>
<tr>
<td class="c1">first font</td>
</tr>
</table>
</body>
</html>
======================================================

 
Less ideal, but still acceptable, would have been the
following (note that the font specification for the
table cell has been converted to a <span> and class):

======================================================
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<title></title>
<style type="text/css">
 span.c1 {font-family: arial; font-size: 150%}
</style>
</head>
<body>
<table>
<tr>
<td><span class="c1">first font</span></td>
</tr>
</table>
</body>
</html>
======================================================
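
For reference, the style rules in both hand-written outputs above
follow the same pattern: element name, generated class, then the
font properties.  Here is a small C sketch of that pattern.
format_font_rule() is a hypothetical helper of mine, not a Tidy
function, and it assumes the size has already been mapped to a CSS
value (size=5 to 150% in my example).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of Tidy.
   Writes "tag.cls {font-family: face; font-size: size}" into buf.
   Returns 0 on success, -1 if the rule does not fit in buf. */
static int format_font_rule(char *buf, size_t len,
                            const char *tag, const char *cls,
                            const char *face, const char *size)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "%s.%s {font-family: %s; font-size: %s}",
                 tag, cls, face, size);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Calling it with "td" gives the rule from testout2.htm, and calling
it with "span" gives the rule from the second example.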

 
Is there an option to achieve either of the desired results
above?  From looking at the source, this seems to have been
an intentional "effect" of the Font2Span() clean-up
code which discards a <FONT> tag that is the only child 
of *any* parent element.
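
To make the behaviour I'm describing concrete, here is a minimal C
sketch of the rule as I understand it.  This is not Tidy's code:
the Node struct, the FontAction enum, and both function names are
hypothetical, and the real Font2Span() may well do more than this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified node; not Tidy's real Node struct. */
typedef struct Node {
    const char  *tag;      /* element name, e.g. "font" */
    struct Node *parent;   /* enclosing element, or NULL */
    struct Node *content;  /* first child, or NULL */
    struct Node *next;     /* next sibling, or NULL */
} Node;

typedef enum { NOT_A_FONT, FONT_TO_SPAN, FONT_DROPPED } FontAction;

/* Returns 1 when node is the sole child of its parent. */
static int is_only_child(const Node *node)
{
    assert(node != NULL);
    return node->parent != NULL
        && node->parent->content == node
        && node->next == NULL;
}

/* My reading of the observed behaviour: an only-child <font> is
   dropped; any other <font> becomes a styled <span>. */
static FontAction font_action(const Node *node)
{
    assert(node != NULL);
    if (strcmp(node->tag, "font") != 0)
        return NOT_A_FONT;
    return is_only_child(node) ? FONT_DROPPED : FONT_TO_SPAN;
}
```

In test.htm the <font> is the first and last child of <td>, so it
falls into the "dropped" case.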
 
As a follow-up, adding the CM_TABLE content-model flag to the
<TD> entry in tags.c results in what I believe
to be the correct output (see attached file testout2.htm).

in tags.c:

"td", VERS_FROM32, (CM_ROW|CM_OPT|CM_NO_INDENT), ParseBlock, null,

was changed to:

"td", VERS_FROM32, (CM_TABLE|CM_ROW|CM_OPT|CM_NO_INDENT), ParseBlock, null,


However, since I am very new to Tidy, I'm not sure
what this might break.  In my limited testing, everything
else seems to be unaffected.
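
My guess at why the change helps: some clean-up step probably tests
the parent's content-model bits before moving the font style onto
it.  The sketch below only illustrates that kind of bitmask test.
The flag values, the TagEntry struct, and parent_accepts_style()
are assumptions of mine, not copied from Tidy, and I am also
assuming that CM_BLOCK is the other flag such a test would accept.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values only; the real constants are defined
   in Tidy's own headers and may differ. */
enum {
    CM_BLOCK     = 1 << 0,
    CM_TABLE     = 1 << 1,
    CM_ROW       = 1 << 2,
    CM_OPT       = 1 << 3,
    CM_NO_INDENT = 1 << 4
};

/* Hypothetical stand-in for one entry of the tag table. */
typedef struct {
    const char *name;
    unsigned    model;  /* OR of CM_* flags */
} TagEntry;

/* Assumed gate: style is merged into the parent only when its
   content model has the block or table bit set. */
static int parent_accepts_style(const TagEntry *tag)
{
    assert(tag != NULL);
    return (tag->model & (CM_BLOCK | CM_TABLE)) != 0;
}
```

With the original flags (CM_ROW|CM_OPT|CM_NO_INDENT) this test
fails for <td>; with CM_TABLE added it passes, which would match
what I am seeing.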

Can you confirm this or offer any other suggestions?

Thanks.
-Shane

Received on Thursday, 21 October 1999 14:24:20 UTC