- From: Advocate <wotw@inebraska.com>
- Date: Fri, 20 Aug 1999 17:58:11 -0500
- To: <html-tidy@w3.org>
While working on changes to HTML Tidy, I ran into a problem stemming from the fact that tag names are now stored in lowercase within HTML Tidy, whereas before they were uppercase. I had assumed that tidy was using wstrcasecmp for comparisons of tag names and attribute names, particularly in the lookup() functions for tag and attribute names. Apparently this is not the case.

At first I thought this could be remedied by patching the relevant lookup() functions to use wstrcasecmp instead of wstrcmp. After grepping the source, though, I see many cases where wstrcmp is used where wstrcasecmp would be more appropriate and less error prone. Affected files include attrs.c, clean.c, entities.c, lexer.c, parser.c, and tags.c. (clean.c in particular has many tests for a "style" attribute using wstrcmp.)

Though HTML Tidy is good about making sure it stores all tag and attribute names in a single case, other software interfacing with HTML Tidy at the code level may not be as careful. Uppercase names are easy to tell apart from surrounding text, so many would be tempted to use uppercase.

I will put together a diff against an unmodified 26Jul99 source that addresses these cases and post it to the list later tonight.

-- 
<http://www.war-of-the-worlds.org/>
"Did you see that Parkins boy's body in the tunnels?"
"Just the photos. Worst thing I've ever seen; kid had no face."
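For illustration only, here is a minimal sketch of the failure mode described above. It is not the promised diff and not HTML Tidy's code: `sketch_strcasecmp`, `TagEntry`, `tag_table`, and `lookup_tag` are hypothetical stand-ins, plain `char` strings are assumed, and the standard `strcmp` plays the role of wstrcmp.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a case-insensitive compare in the spirit of
   wstrcasecmp; this is an assumption, not HTML Tidy's implementation. */
static int sketch_strcasecmp(const char *s1, const char *s2)
{
    unsigned char c1, c2;

    do {
        c1 = (unsigned char)tolower((unsigned char)*s1++);
        c2 = (unsigned char)tolower((unsigned char)*s2++);
    } while (c1 != '\0' && c1 == c2);

    return (int)c1 - (int)c2;
}

/* Hypothetical table whose names are stored in lowercase only. */
typedef struct {
    const char *name;
    int id;
} TagEntry;

static const TagEntry tag_table[] = {
    { "html", 1 },
    { "body", 2 },
    { "style", 3 }
};

/* Linear lookup; the comparison function is a parameter so that the
   case-sensitive and case-insensitive behaviour can be compared. */
static const TagEntry *lookup_tag(const char *name,
                                  int (*cmp)(const char *, const char *))
{
    size_t i;

    for (i = 0; i < sizeof tag_table / sizeof tag_table[0]; ++i) {
        if (cmp(tag_table[i].name, name) == 0)
            return &tag_table[i];
    }
    return NULL;
}
```

Under these assumptions, `lookup_tag("STYLE", strcmp)` returns NULL because the table holds only "style", while `lookup_tag("STYLE", sketch_strcasecmp)` finds the entry. A caller that passes uppercase names would hit exactly this mismatch with a case-sensitive lookup.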
Received on Friday, 20 August 1999 19:04:19 UTC