- From: Terry Teague <teague@mailandnews.com>
- Date: Thu, 30 Nov 2000 00:23:19 -0800
- To: html-tidy@w3.org
Dear Folks,
A user reported to me that Tidy was incorrectly reporting a missing table
summary for HTML 3.2 documents. I did some investigation, and reproduced
the problem with both the 04 Aug 00 Mac OS and Windows versions of Tidy.
Sample HTML :
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
<head>
<title>Some title</title>
</head>
<body>
<table>
<tr>
<td>
Missing table summary only applies to HTML 4.x
</td>
</tr>
</table>
</body>
</html>
Running this through Tidy gives :
Tidy (vers 4th August 2000) Parsing console input (stdin)
line 7 column 1 - Warning: <table> lacks "summary" attribute
stdin: Doctype given is "-//W3C//DTD HTML 3.2 Final//EN"
stdin: Document content looks like HTML 3.2
1 warnings/errors were found!
I looked at the Tidy code, and saw the following pieces :
attrs.c :
{"summary", VERS_HTML40, TEXT}, /* TABLE */
...
/* suppress warning for missing summary for HTML 2.0 and HTML 3.2 */
if (!HasSummary && lexer->doctype != VERS_HTML20 && lexer->doctype !=
VERS_HTML32)
{
lexer->badAccess |= MISSING_SUMMARY;
ReportAttrError(lexer, node, "summary", MISSING_ATTRIBUTE);
}
...
lexer.c :
{"HTML 3.2", "XHTML 1.0 Transitional", voyager_loose, VERS_HTML32},
...
/* compute length of identifier e.g. "HTML 2.0" */
for (j = i + 14; j < doctype->end && lexer->lexbuf[j] !=
'/'; ++j);
len = j - i - 14;
s = W3C_Version[0].name;
if (len == wstrlen(s) && wstrncmp(p, s, len) == 0)
return W3C_Version[0].code;
...
/* make a note of the version named by the doctype */
lexer->doctype = FindGivenVersion(lexer, lexer->token);
Basically what happens is that FindGivenVersion() in lexer.c sets
lexer->doctype to VERS_UNKNOWN, rather than VERS_HTML32, because the
wstrncmp() fails to match "HTML 3.2 Final" to "HTML 3.2". But ironically
ApparentVersion() in lexer.c figures out (from lexer->versions) that it is
dealing with HTML 3.2 code.
According to <http://www.w3.org/TR/REC-html32>, the doctype for HTML 3.2
should be specified as shown in the sample.
If I however change the sample HTML doctype to be :
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
this causes the incorrect table summary problem to go away.
I see a couple of choices to fix this (and other potential HTML 3.2 problems) :
1) Change the string constant in lexer.c
from :
{"HTML 3.2", "XHTML 1.0 Transitional", voyager_loose, VERS_HTML32},
to :
{"HTML 3.2 Final", "XHTML 1.0 Transitional", voyager_loose, VERS_HTML32},
2) Change FindGivenVersion() in lexer.c
from :
s = W3C_Version[0].name;
if (len == wstrlen(s) && wstrncmp(p, s, len) == 0)
return W3C_Version[0].code;
to something like :
s = W3C_Version[0].name;
if (wstrncmp(p, s, wstrlen(s)) == 0)
return W3C_Version[0].code;
Comments?
Regards, Terry
Received on Thursday, 30 November 2000 03:24:43 UTC