Hi Matt,

Can you send a sample file with its config? 0x98 is either an illegal character or the Windows-1252 "small tilde" (which should be translated to U+02DC).

Can you reproduce the problem with the command-line tool? If so, then we can treat it as a Tidy issue. Otherwise, my bad.

There is still a spurious extra newline (0xD) problem with some encodings. It may be related.

take it easy,
Charlie

At 06:26 PM 4/3/2003 +0100, Matthew Stanfield wrote:
>Hi,
>
>When tidying html and outputting as xml, there is a symbol that is
>appearing at the start of my XML files, ascii value is 0x98. How do I stop
>it appearing?
>
>I assume this is the 'unicode Byte Order Mark character' that is mentioned
>in the Tidy configuration options reference. However if I set 'output-bom'
>to false the symbol still appears. I've tried using various char encodings
>setting all these char-encoding, input-encoding, output-encoding to:
>ascii, latin1, raw, and utf16 --the character is always there regardless
>of what encoding I use.
>
>The char is stopping tidy output as xml from being read correctly by
>.net's C# XPathDocument class. When I manually remove the char all works fine.
>
>I am using Charles Reitzel's COM/ATL dll.
>
>Many thanks and regards,
>
>..matthew

Received on Thursday, 3 April 2003 12:46:44 UTC
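
For anyone wanting to check a stray leading byte like the one discussed in this thread, here is a minimal Python sketch. It is not part of Tidy or of the COM/ATL wrapper; the helper names `describe_byte` and `strip_leading_bom` are hypothetical, made up for illustration. It shows how the byte 0x98 maps under Windows-1252 and that it does not match the start of the usual UTF-8 or UTF-16 byte order marks, which may explain why 'output-bom' has no effect on it.

```python
import codecs

# Hypothetical helper: report how a single byte decodes under Windows-1252.
def describe_byte(value: int) -> str:
    raw = bytes([value])
    try:
        ch = raw.decode("cp1252")
    except UnicodeDecodeError:
        return "undefined in cp1252"
    return f"U+{ord(ch):04X}"

# Hypothetical helper: drop a leading UTF-8 or UTF-16 BOM if one is present.
# Any other leading byte (such as 0x98) is left untouched.
def strip_leading_bom(data: bytes) -> bytes:
    for bom in (codecs.BOM_UTF8, codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE):
        if data.startswith(bom):
            return data[len(bom):]
    return data

if __name__ == "__main__":
    print(describe_byte(0x98))
    print(strip_leading_bom(b"\xef\xbb\xbf<doc/>"))
    print(strip_leading_bom(b"\x98<doc/>"))
```

If the file really begins with 0x98, a BOM-stripping step like this leaves it in place, so the byte would have to be traced to the input data or the encoding conversion instead.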