HTString.c : possible bug in HTNextField ?

In an extension I am writing, I need to parse a file format that contains
strings enclosed in double quotes ("), like "this string here".
For this purpose, I thought that the HTNextField function provided in
HTString.c would come in handy, as it claims (according to my reading) to
handle fields enclosed in double quotes, as well as some other options.
However, when I tried it out, it did not work at all. Every string field
in quotes came back as a NULL pointer, as if the field did not exist.
Removing the quotes immediately produced the right strings.

Looking at the code of HTNextField, it seems clear what is happening. When
the first double quote is hit, the code runs to the next double quote, but
then, instead of exiting and returning this result, it continues in the
loop, thereby losing the contents enclosed within the quotes, until it
finally runs off the end of the line entirely. A grep in the
Library/Implementation/ directory shows 21 occurrences where HTNextField
seems to be used. I do not know whether any of these places uses the quote
feature (I doubt it), and it could well be that this feature has not been
well tested, if at all.
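
To make the control flow described above easier to follow, here is a
simplified, standalone sketch. It is not the libwww source: the names
next_field_sketch, is_delim and the stop_after_quote flag are made up for
illustration, and the delimiter set (whitespace, comma, semicolon, equals
sign) is an assumption. With stop_after_quote set to 0 the loop keeps going
after the closing quote, as in the behaviour reported here; with a non-zero
value it leaves the loop at that point.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical delimiter test; the delimiter set is an assumption. */
static int is_delim(char c)
{
    return c != '\0' &&
           (isspace((unsigned char)c) || strchr(",;=", c) != NULL);
}

/* Simplified illustration only, NOT the real HTNextField.
 * Returns the next field in *pstr (modifying the buffer in place),
 * or NULL when no field is left. */
char *next_field_sketch(char **pstr, int stop_after_quote)
{
    char *p = *pstr;
    char *start = NULL;

    for (;;) {
        while (*p && is_delim(*p)) p++;          /* skip delimiters */
        if (!*p) {                               /* end of input */
            *pstr = p;
            return NULL;                         /* no field */
        }
        if (*p == '"') {                         /* quoted field */
            start = ++p;
            for (; *p && *p != '"'; p++)
                if (*p == '\\' && *(p + 1)) p++; /* skip escaped chars */
            if (stop_after_quote) break;         /* leave the loop here */
            /* Otherwise the loop goes round again: the closing quote is
             * taken for a new opening quote and the field is lost. */
        } else {                                 /* unquoted field */
            start = p;
            while (*p && !is_delim(*p)) p++;
            break;
        }
    }
    if (*p) *p++ = '\0';                         /* terminate the field */
    *pstr = p;
    return start;
}
```

In this sketch, a quoted field followed by more text yields NULL when
stop_after_quote is 0, because the scan carries on until the end of the
buffer, whereas a non-zero value returns the text between the quotes and
leaves *pstr just past the closing quote.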

The following is a patch that fixes the double quote problem for me in
libwww 3.1. The quoting feature with < and > seems to be similarly broken,
but I have neither tested nor fixed it. I will leave this to somebody who
is more knowledgeable about this matter, and who might know more of the
history behind the HTNextField function, and whether my modification breaks
something else or not.

--- HTString.c.~1~      Thu Jul 13 15:41:06 1995
+++ HTString.c  Tue Oct 10 00:10:22 1995
@@ -9,6 +9,7 @@
 **     02-Dec-91 (JFG) Added stralloccopy and stralloccat
 **     23 Jan 92 (TBL) Changed strallocc* to 8 char HTSAC* for VM and suchlike
 **      6 Oct 92 (TBL) Moved WWW_TraceFlag in here to be in library
+**      9 Oct 95 (KR)  fixed problem with double quotes in HTNextField
 */

 /* Library include files */
@@ -177,6 +178,7 @@
            start = ++p;
            for(;*p && *p!='"'; p++)
                if (*p == '\\' && *(p+1)) p++;         /* Skip escaped chars */
+           break;                          /* kr95-10-9: needs to stop here */
        } else if (*p == '<') {                              /* quoted field */
            for(;*p && *p!='>'; p++)
                if (*p == '\\' && *(p+1)) p++;         /* Skip escaped chars */



Greetings
Markus Krummenacker

Received on Saturday, 14 October 1995 21:43:25 UTC