Bug in HTTP header parsing?

Hi all,

there is something weird about HTAnchor_charset(): I never get anything back
from it.

I use libwww-pre.5.3.0, and the behaviour was the same in 5.2.8.
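
To make the symptom concrete, here is roughly what I do in my
terminate handler, an after filter that runs once the request is done
(a simplified sketch, not my exact code; error handling omitted):

  #include <stdio.h>
  #include "WWWLib.h"    /* libwww master include */

  /* After filter (HTNetAfter); queries the anchor once the request is done. */
  int terminate_handler (HTRequest * request, HTResponse * response,
                         void * param, int status)
  {
      HTParentAnchor * anchor = HTRequest_anchor(request);

      HTFormat  format  = HTAnchor_format(anchor);     /* "text/html"        */
      HTCharset charset = HTAnchor_charset(anchor);    /* always NULL for me */

      printf("type    = %s\n", format  ? HTAtom_name(format)  : "(none)");
      printf("charset = %s\n", charset ? HTAtom_name(charset) : "(none)");

      return HT_OK;
  }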

Please consider the following test case:

Here is the part of the page whose charset I fail to pick up
(http://www.silicom.fr/sams/sams_present.htm):
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

<meta name="GENERATOR" content="Microsoft FrontPage Express 2.0">
<title>sams_present</title>
</head>

I fetch the URL above (I should add that everything is OK except for
this problem).
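
The fetch itself is the usual pattern; stripped down, it is roughly
this (sketch only, my real initialisation differs slightly):

  /* Register the after filter and load the page (simplified sketch). */
  HTProfile_newNoCacheClient("charset-test", "1.0");
  HTNet_addAfter(terminate_handler, NULL, NULL, HT_ALL, HT_FILTER_LAST);

  HTRequest * request = HTRequest_new();
  HTLoadAbsolute("http://www.silicom.fr/sams/sams_present.htm", request);
  HTEventList_loop(request);    /* run until the after filter has fired */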

The following xxgdb trace shows the values extracted from the anchor
being processed in my terminate_handler:

  content_type = 0x27ac08,
  type_parameters = 0x0,
  meta_tags = 0x28cc48,
  content_base = 0x28d610 "http://www.silicom.fr/sams/sams_present.htm",

  content_encoding = 0x0,
  content_language = 0x0,
  content_length = 16047,

The content_type field of my anchor contains only "text/html"; its
value is shown below:
(xxgdb) print *((HTAtom *) 0x27ac08)
$7 = {
  next = 0x0,
  name = 0x2795c8 "text/html"
}

So, obviously, there is no charset in it.

The meta_tags field (an HTAssocList) contains only the pair
"GENERATOR" / "Microsoft FrontPage Express 2.0".

As far as I know, all these values are extracted from the Response
object by HTAnchor_update(), so my application is not responsible for
handling these items itself.

Where is my charset ????

Has anybody had the same problem? Am I wrong in my analysis? Is it a
known problem? Do you have a solution?

Thanks a lot for your answers.

Francois.
