- From: Roger McCalman <r.mccalman@elsevier.co.uk>
- Date: Thu, 25 Nov 1999 09:14:04 +0000
- To: www-lib@w3.org
I would say that the results you are seeing are due to the HTTP 1.1
chunked encoding. Each chunk is preceded by its size, written in
hexadecimal; the "19c" in your output is one of these counts. Try using
WWW_SOURCE rather than WWW_RAW.

Cheers,
Roger

On Thu, Nov 25, 1999 at 10:04:51AM +0100, Mark Wormgoor wrote:
> Hi,
>
> Using libwww I am trying to write a small application that will fetch
> newsheaders from sites like slashdot and such. For this reason I'm using a
> raw get of the page (slashdot and freshmeat use xml). The platform is
> Redhat 6.0 with libwww 5.2.8. I'v attached test-source to the program.
>
> The problem is this. When I try to fetch the URL in the sourcecode (a
> Dutch newssite), it contains strange characters in the middle of the raw
> output, for example:
>   <img src=
>   19c
>   '../grafx/nw_letter_nieuws.gif'
> When I download the same page in Netscape, it prints:
>   <img src='../grafx/nw_letter_nieuws.gif'
> which is the correct code. Every time I download the page, these things
> appear at the same place. When the page changes, I get different
> characters at different locations.
>
> If somebody knows what's causing this, I would really like to know.
> BTW, I compile this using:
>   gcc -O6 `libwww-config --cflags` -Wall `libwww-config --libs` -o test test.c
>
> Kind regards,
>
> Mark Wormgoor
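
To illustrate the chunked framing mentioned in the reply, here is a minimal
sketch in C of what removing the chunk counts by hand would involve. It is
not part of libwww: the function name decode_chunked and its signature are
made up for this example, and it assumes the whole raw body is already in
memory, skips chunk extensions, and ignores trailers. Asking libwww for
WWW_SOURCE, as suggested above, should make this unnecessary.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a libwww call): decode an HTTP/1.1 chunked body.
 * in/in_len: raw body bytes; out: caller buffer of at least in_len bytes.
 * Returns the decoded length, or -1 if the input is malformed or truncated. */
long decode_chunked(const char *in, size_t in_len, char *out)
{
    size_t pos = 0, out_len = 0;

    assert(in != NULL && out != NULL);

    for (;;) {
        /* Locate the CRLF that terminates the chunk-size line. */
        size_t eol = pos;
        while (eol + 1 < in_len && !(in[eol] == '\r' && in[eol + 1] == '\n'))
            eol++;
        if (eol + 1 >= in_len)
            return -1;                      /* no CRLF found */

        /* Parse the hexadecimal chunk size, e.g. "19c" is 412 bytes. */
        size_t size = 0;
        int digits = 0;
        for (size_t i = pos; i < eol; i++) {
            int c = (unsigned char)in[i], v;
            if (c >= '0' && c <= '9')      v = c - '0';
            else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
            else break;                     /* start of a chunk extension */
            if (size > (SIZE_MAX >> 4))
                return -1;                  /* size would overflow */
            size = size * 16 + (size_t)v;
            digits++;
        }
        if (digits == 0)
            return -1;                      /* size line has no hex digits */

        pos = eol + 2;                      /* step past the CRLF */
        if (size == 0)
            return (long)out_len;           /* zero-size chunk ends the body */
        if (size > in_len - pos)
            return -1;                      /* chunk data is truncated */

        memcpy(out + out_len, in + pos, size);
        out_len += size;
        pos += size;

        /* Each chunk's data is followed by its own CRLF. */
        if (in_len - pos < 2 || in[pos] != '\r' || in[pos + 1] != '\n')
            return -1;
        pos += 2;
    }
}
```

Fed the body "5\r\n<img \r\n3\r\nsrc\r\n0\r\n\r\n", the function writes the
8 bytes "<img src" to out, which is the same effect as the stray count
disappearing from the middle of the tag in the quoted example.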
Received on Thursday, 25 November 1999 04:14:15 UTC