Re: W3Lib local file access problem.

> Teo Kok Hoon wrote:
> 
> >    I have been working with the library on the Windows platform for a while and 
> >would like to point out one problem I faced with local file access.
> >
> >    Looking through the source code (see "HTFile.c", the "HTLoadFile" function), 
> >I realise that all local files are opened via a call to
> >
> >	fopen( file->localname,"r" );
> >
> >My concern is that, for anchors leading to a local file, especially an in-line 
> >binary image file, this approach does not correctly complete the fetch of the
> >image data. The reason is simple: a binary file is opened as ASCII.
> 
> 
> I had similar problems with writing to local files. I am using the library with 
> Visual C++ 2.0 on Windows 95.
> 
> The HTCacheWriter stream was expanding linefeed characters into CR+LF, 
> which caused binary files to be corrupted. The easiest workaround to this problem 
> I have found was to link the main program (not the library) with the BINMODE.OBJ 
> file. After this all fopen calls (with "w" or "r") open files in the binary mode. 
> I suppose this method works for reading from local files too. 
> 
> There is however one drawback: text files are opened as binary (so my solution 
> doesn't eliminate the problem; it converts it to its opposite :-)). In many situations 
> this doesn't matter. For example, many applications for PCs deal correctly with 
> text files with LFs only (instead of CR+LF pairs).
> 
> Teo Kok Hoon's patch corrects the HTLoadFile function only, so perhaps my 
> solution can help people who have problems with writing to files (or with the 
> cache). I know it isn't perfect, but it is easy to apply and sufficient in many 
> situations.

The way to do this is to always open local files in binary mode whenever the 
contents are unknown at the time the file is opened. Log files etc. can 
always be opened in text mode. In most cases the data object does not have to 
be canonicalized (for example in the case of the HTCacheWriter stream), but if 
it is required, it can be done by inserting a small stream that does it just 
before the file writer stream.
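
As a rough illustration only, here is a sketch in plain C of the two pieces 
described above: opening a local file in binary mode, and a small filter 
that could sit in front of a file writer. It does not use the library's 
stream interface. The names open_local_binary, CrlfState, crlf_to_lf and 
crlf_flush are hypothetical, and the direction shown (CR LF pairs collapsed 
to LF) is an assumption; the other direction would be built the same way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of the library: open a local file whose
   content type is not yet known. "rb" disables newline translation. */
FILE *open_local_binary(const char *localname)
{
    assert(localname != NULL);
    return fopen(localname, "rb");
}

/* Hypothetical filter state: remembers a CR seen at the end of the
   previous chunk, so a CR LF pair split across two chunks is handled. */
typedef struct {
    int pending_cr;
} CrlfState;

/* Copy src to dst, turning each CR LF pair into a single LF. A lone CR
   is passed through unchanged. dst must have room for len + 1 bytes.
   Returns the number of bytes written to dst. */
size_t crlf_to_lf(CrlfState *st, const char *src, size_t len, char *dst)
{
    size_t i, out = 0;

    for (i = 0; i < len; i++) {
        char c = src[i];
        if (st->pending_cr) {
            st->pending_cr = 0;
            if (c != '\n')
                dst[out++] = '\r';  /* held CR was not part of CR LF */
        }
        if (c == '\r')
            st->pending_cr = 1;     /* decide when the next byte arrives */
        else
            dst[out++] = c;
    }
    return out;
}

/* Call once at end of data: emits a trailing lone CR, if one is held.
   dst must have room for 1 byte. Returns the number of bytes written. */
size_t crlf_flush(CrlfState *st, char *dst)
{
    if (!st->pending_cr)
        return 0;
    st->pending_cr = 0;
    dst[0] = '\r';
    return 1;
}
```

The state is kept between calls because network data arrives in arbitrary 
chunks, so a CR can be the last byte of one chunk and its LF the first byte 
of the next.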

There is actually a stream that does the opposite in the HTNetTxt module. This 
can easily be modified to go the other way.

The next version of the library opens the files in binary mode. However, it 
currently doesn't use any canonicalization stream - I haven't had the time :-(

-- 

Henrik Frystyk Nielsen, <frystyk@w3.org>
World-Wide Web Consortium, MIT/LCS NE43-356
545 Technology Square, Cambridge MA 02139, USA

Received on Saturday, 7 October 1995 22:44:57 UTC