persistent xml-decl vs. char-encoding

I'm trying to stop Tidy from emitting the XML declaration, because I
want it to read <?xml version="1.0" encoding="iso-8859-2"?>. As far as
I can see, Tidy won't let me specify this encoding, so I write the whole
line myself from a shell script. Setting add-xml-decl to "no" does the
job, but only if I don't also set char-encoding to "raw". (I set it to
"raw" to keep Tidy from mangling the Latin-2 characters in the files I
process.)

So, with command-line arguments, the declaration is suppressed when I
run, e.g.:

tidy --output-xml yes --add-xml-decl no --tidy-mark no "$1" >> "$1.xml"

but it stops working if I do:

tidy --output-xml yes --add-xml-decl no --char-encoding raw "$1" >> "$1.xml"

To make things even more interesting, let me add that if I specify
char-encoding as "ascii", it works as it should...
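To compare the settings side by side, a small loop along these lines
might help. It is only a sketch: the function names are made up for
illustration, it assumes tidy is on the PATH, and it assumes that the
declaration, when emitted, is the first line of the output.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the first line read from stdin
# starts with an XML declaration.
starts_with_xml_decl() {
    head -n 1 | grep -q '^<?xml[[:space:]]'
}

# Hypothetical helper: run Tidy on file $1 once per encoding and report
# whether the declaration shows up despite --add-xml-decl no.
check_encodings() {
    for enc in raw ascii; do
        if tidy --output-xml yes --add-xml-decl no --tidy-mark no \
                --char-encoding "$enc" "$1" 2>/dev/null | starts_with_xml_decl
        then
            echo "$enc: declaration still emitted"
        else
            echo "$enc: declaration suppressed"
        fi
    done
}
```

Calling check_encodings with the name of one of the input files should
then show which encodings are affected.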

I get the same behaviour with the 1 Jan and 1 Feb versions.
Additionally, the Jan version apparently won't read my config file, and
the Feb version segfaults on the files I need to process (bug report
already posted), so I'm somewhat stuck and will gratefully accept any
advice :-) If I have to, I will transcode my files before feeding
them to Tidy, but maybe there's something about the config options that
I've missed, or some upcoming fix only days (hours? ;-) ) away?
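In the meantime, one possible stopgap that avoids transcoding is to
filter Tidy's output: drop whatever declaration it emits and write the
wanted one instead. This is a minimal sketch, not a fix for the
underlying behaviour. The function names and the DECL variable are
made up for illustration, and it assumes the declaration, if present,
sits alone on the first line of the output.

```shell
#!/bin/sh
# The declaration we want at the top of every output file.
DECL='<?xml version="1.0" encoding="iso-8859-2"?>'

# Hypothetical filter: print our own declaration, then copy stdin,
# dropping the first line only if it is itself an XML declaration.
replace_xml_decl() {
    printf '%s\n' "$DECL"
    sed '1{/^<?xml[[:space:]]/d;}'
}

# Hypothetical wrapper: run Tidy on file $1 in raw mode and write the
# result to $1.xml with the declaration swapped.
tidy_latin2() {
    tidy --output-xml yes --add-xml-decl no --tidy-mark no \
         --char-encoding raw "$1" | replace_xml_decl > "$1.xml"
}
```

Because the filter strips the declaration only when it is present, it
should behave the same whether or not add-xml-decl is honoured.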

Thanks,

   Piotr

Received on Sunday, 2 February 2003 17:05:16 UTC