W3C home > Mailing lists > Public > html-tidy@w3.org > July to September 1999

Many instances of wstrcmp should be changed to wstrcasecmp

From: Advocate <wotw@inebraska.com>
Date: Fri, 20 Aug 1999 17:58:11 -0500
Message-Id: <v04205502b3e388b72c58@[206.222.216.26]>
To: <html-tidy@w3.org>
In working on changes to HTML Tidy, I encountered a problem stemming 
from the fact that tag names are now stored in lowercase form within 
HTML Tidy whereas before they were uppercase.  I had assumed that 
tidy was using wstrcasecmp for comparisons of tag names and attribute 
names, particularly in the lookup() functions for tag and attribute 
names.

Apparently this is not the case.  At first I thought this could be 
remedied by patching the relevant lookup() functions to use 
wstrcasecmp instead of wstrcmp, but now doing a grep of the source I 
see many many cases where wstrcmp is being used where wstrcasecmp is 
much more appropriate and less error prone.  Affected files include 
attrs.c, clean.c, entities.c, lexer.c, parser.c, and tags.c.  (Clean 
has many tests for a "style" attribute using wstrcmp.)

Though HTML Tidy is good about making sure it stores all tags and 
attribute names in a single case, other software interfacing with 
HTML Tidy on a code level may not be as careful (especially since 
having them in uppercase makes them easy to differentiate them from 
other text where the case cannot be reversed, many would be tempted 
to use uppercase).

I will put together a diff applied to a virgin 26Jul99 that will 
address these later tonight and post it to the list.
-- 
          ,=<#)-=#  <http://www.war-of-the-worlds.org/>
     ,_--//--_,
  _-~_-(####)-_~-_  "Did you see that Parkins boy's body in the tunnels?" "Just
(#>_--'~--~`--_<#)  the photos.  Worst thing I've ever seen; kid had no face."
Received on Friday, 20 August 1999 19:04:19 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 3 April 2012 06:13:42 GMT