- From: Charles Reitzel <creitzel@rcn.com>
- Date: Thu, 03 Apr 2003 12:46:01 -0500
- To: Matthew Stanfield <mattstan@blueyonder.co.uk>
- Cc: html-tidy <html-tidy@w3.org>
Hi Matt,

Can you send a sample file with your config? 0x98 is either an illegal character or a Windows-1252 "small tilde" (which should be translated to U+02DC).

Can you reproduce the problem with the command line tool? If so, then we can treat it as a Tidy issue. Otherwise, my bad.

There is still a spurious extra newline (0xD) problem with some encodings. It may be related.

take it easy,
Charlie

At 06:26 PM 4/3/2003 +0100, Matthew Stanfield wrote:
>Hi,
>
>When tidying html and outputting as xml, there is a symbol that is
>appearing at the start of my XML files, ascii value is 0x98. How do I stop
>it appearing?
>
>I assume this is the 'unicode Byte Order Mark character' that is mentioned
>in the Tidy configuration options reference. However if I set 'output-bom'
>to false the symbol still appears. I've tried using various char encodings
>setting all these char-encoding, input-encoding, output-encoding to:
>ascii, latin1, raw, and utf16 --the character is always there regardless
>of what encoding I use.
>
>The char is stopping tidy output as xml from being read correctly by
>.net's C# XPathDocument class. When I manually remove the char all works fine.
>
>I am using Charles Reitzel's COM/ATL dll.
>
>Many thanks and regards,
>
>..matthew
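
For anyone who wants to check what is actually at the start of such an output file, here is a rough Python sketch (not part of Tidy or the COM wrapper). It compares the leading bytes against the standard byte order marks and, failing that, reports what a leading 0x98 decodes to in Windows-1252. The function name `describe_leading_bytes` and its return strings are made up for illustration.

```python
import codecs

# Standard byte order marks. The UTF-32 entries come first because the
# UTF-32 LE mark (FF FE 00 00) begins with the UTF-16 LE mark (FF FE).
BOMS = [
    ("utf-32-le", codecs.BOM_UTF32_LE),
    ("utf-32-be", codecs.BOM_UTF32_BE),
    ("utf-8", codecs.BOM_UTF8),
    ("utf-16-le", codecs.BOM_UTF16_LE),
    ("utf-16-be", codecs.BOM_UTF16_BE),
]


def describe_leading_bytes(data: bytes) -> str:
    """Hypothetical helper: classify the first bytes of an output file."""
    for name, bom in BOMS:
        if data.startswith(bom):
            return "BOM (" + name + ")"
    if data[:1] == b"\x98":
        # 0x98 is not the start of any BOM; report its Windows-1252 mapping.
        code_point = ord(b"\x98".decode("cp1252"))
        return "0x98: not a BOM; U+%04X in Windows-1252" % code_point
    return "no BOM"
```

Reading the first four bytes of the file in binary mode and passing them to the helper is enough, e.g. `describe_leading_bytes(open("out.xml", "rb").read(4))`, where `out.xml` is a placeholder name. Since no byte order mark begins with 0x98, a result of "not a BOM" would suggest that `output-bom` is not the option involved.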
Received on Thursday, 3 April 2003 12:46:44 UTC