- From: Mark Wormgoor <riddles@cistron.nl>
- Date: Thu, 25 Nov 1999 10:23:28 +0100
- To: <www-lib@w3.org>
Hi,

First, I did try this. However, I would also like to download the source of XML pages like http://slashdot.org/slashdot.xml. How can I download these if I'm not using WWW_RAW? Or is there a way to get rid of the block counts, or to force HTTP/1.0?

Kind regards,

Mark Wormgoor

----- Original Message -----
From: Roger McCalman <r.mccalman@elsevier.co.uk>
To: <www-lib@w3.org>
Sent: Thursday, November 25, 1999 10:14 AM
Subject: Re: Problems with RAW GET

> I would say that the results you are seeing are due to HTTP/1.1 chunked
> encoding. Each block is preceded by a count. Try using WWW_SOURCE rather
> than WWW_RAW.
>
> Cheers, Roger
>
> On Thu, Nov 25, 1999 at 10:04:51AM +0100, Mark Wormgoor wrote:
> > Hi,
> >
> > Using libwww, I am trying to write a small application that fetches
> > news headers from sites like Slashdot. For this reason I'm doing a
> > raw GET of the page (Slashdot and Freshmeat use XML). The platform is
> > Red Hat 6.0 with libwww 5.2.8. I've attached the test source for the
> > program.
> >
> > The problem is this: when I fetch the URL in the source code (a Dutch
> > news site), strange characters appear in the middle of the raw output,
> > for example:
> >
> >     <img src=
> >     19c
> >     '../grafx/nw_letter_nieuws.gif'
> >
> > When I download the same page in Netscape, it prints:
> >
> >     <img src='../grafx/nw_letter_nieuws.gif'
> >
> > which is the correct code. Every time I download the page, these
> > characters appear in the same place. When the page changes, I get
> > different characters at different locations.
> >
> > If somebody knows what's causing this, I would really like to know.
> > BTW, I compile this with:
> >
> >     gcc -O6 `libwww-config --cflags` -Wall `libwww-config --libs` -o test test.c
> >
> > Kind regards,
> >
> > Mark Wormgoor
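The "19c" in the output above is a chunk-size line from HTTP/1.1 chunked transfer encoding (RFC 2068): each chunk of the response body is preceded by its length in hexadecimal, so "19c" announces a chunk of 0x19c = 412 bytes. On the wire a chunked response looks roughly like this (illustrative only; sizes and chunk boundaries differ from response to response):

    HTTP/1.1 200 OK
    Transfer-Encoding: chunked

    19c
    ... 412 bytes of data, possibly ending mid-tag: <img src=
    158
    ... 344 more bytes, starting '../grafx/nw_letter_nieuws.gif' ...
    0

A WWW_RAW download passes this framing through untouched, which is why the counts end up in the middle of the saved page.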
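As for fetching slashdot.xml without the counts: WWW_SOURCE tells libwww to deliver the document source with transfer encodings (including the HTTP/1.1 chunking) removed, but without converting the content itself, so it should work for XML pages just as well as WWW_RAW did. A minimal sketch, modelled on the chunk example that ships with libwww (untested as posted; error handling omitted):

    #include "WWWLib.h"
    #include "WWWInit.h"

    int main (int argc, char ** argv)
    {
        HTRequest * request;
        HTChunk * chunk;
        char * uri = (argc > 1) ? argv[1] : "http://slashdot.org/slashdot.xml";

        /* Blocking (preemptive) client profile */
        HTProfile_newPreemptiveClient("TestApp", "1.0");

        request = HTRequest_new();

        /* WWW_SOURCE: document source after transfer decodings,
           so no chunk counts in the output */
        HTRequest_setOutputFormat(request, WWW_SOURCE);

        chunk = HTLoadToChunk(uri, request);
        if (chunk) {
            /* HTChunk_toCString frees the chunk and hands back the data */
            char * string = HTChunk_toCString(chunk);
            if (string) HTPrint("%s", string);
            HT_FREE(string);
        }

        HTRequest_delete(request);
        HTProfile_delete();
        return 0;
    }

This compiles the same way as the original test program. Forcing HTTP/1.0 would also avoid the chunking altogether, since HTTP/1.0 has no chunked encoding, but with WWW_SOURCE that should not be necessary.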
Received on Thursday, 25 November 1999 04:25:38 UTC