Annoying Symbol At Start Of XML Outputted File.

From: Matthew Stanfield <mattstan@blueyonder.co.uk>
Date: Thu, 03 Apr 2003 18:26:33 +0100
Message-ID: <3E8C6EC9.7010708@blueyonder.co.uk>
To: html-tidy <html-tidy@w3.org>, html-tidy-developers <tidy-develop@lists.sourceforge.net>
Cc: Charles Reitzel <creitzel@rcn.com>


When tidying HTML and outputting it as XML, a stray symbol appears at the 
start of my XML files; its byte value is 0x98. How do I stop it from 
appearing?

I assume this is the 'Unicode Byte Order Mark character' that is mentioned 
in the Tidy configuration options reference. However, if I set 'output-bom' 
to false the symbol still appears. I have also tried setting each of 
char-encoding, input-encoding and output-encoding to ascii, latin1, raw and 
utf16. The character is always there regardless of which encoding I use.

The character stops Tidy's XML output from being read correctly by .NET's 
XPathDocument class (I am working in C#). When I manually remove the 
character, everything works fine.
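
The manual removal described above could be automated as a stopgap until the 
cause is found. The following is only a sketch, written in Python for 
illustration rather than in C#. The helper name strip_leading_junk is 
hypothetical and is not part of Tidy or the COM wrapper. It first removes a 
recognised byte order mark; failing that, it assumes an ASCII-compatible 
encoding (ascii, latin1 or UTF-8) and drops whatever precedes the first '<'.

```python
import codecs

# Known byte order marks. The UTF-32 marks come first because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
_BOMS = (
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF8,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)


def strip_leading_junk(data: bytes) -> bytes:
    """Return data without a leading BOM or stray bytes before the first '<'.

    Hypothetical helper: the fallback branch assumes an ASCII-compatible
    encoding, so it is not safe for UTF-16 or UTF-32 input that lacks a BOM.
    """
    for bom in _BOMS:
        if data.startswith(bom):
            return data[len(bom):]
    start = data.find(b"<")
    # start == 0: already clean; start == -1: no markup found, leave as is.
    return data[start:] if start > 0 else data
```

The cleaned bytes could then be handed to the XML reader in place of the raw 
Tidy output. This hides the symptom only; it does not explain why the 0x98 
byte is written in the first place.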

I am using Charles Reitzel's COM/ATL DLL.

Many thanks and regards,

Received on Thursday, 3 April 2003 12:49:58 UTC
