- From: Dave Raggett <dsr@w3.org>
- Date: Sat, 16 Sep 2000 18:48:43 +0100 (GMT Daylight Time)
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- cc: Mikael Ståldal <d96-mst-ingen-reklam@d.kth.se>, html-tidy@w3.org
On Mon, 4 Sep 2000, Bjoern Hoehrmann wrote:

> * "Mikael Ståldal" <d96-mst-ingen-reklam@d.kth.se> wrote:
> | When using HTML Tidy with the options -asxml -latin1, it doesn't output
> |
> | <?xml version="1.0" encoding="iso-8859-1"?>
> |
> | as it should in order to produce well-formed XML. Without the encoding
> | specification, an XML parser will assume UTF-8.
>
> Use '--add-xml-decl yes' but i agree, that tidy should do this
> automatically (if there are iso-8859-1 characters in the file.
> If all chars are encoded as entities it isn't necessary,
> because the file is us-ascii and us-ascii is a subset of utf-8,
> the default encoding of XML files.)

I have modified AdjustConfig() in config.c and the misnamed FixXMLPI() in lexer.c to deal with this. This feature will be available in the next release, as further thought is needed on dealing with, say, Microsoft Windows-specific encodings.

Regards,

-- Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
tel/fax: +44 122 578 3011 (or 2521) +44 778 532 0444 (mobile)
World Wide Web Consortium (on assignment from HP Labs)
Received on Saturday, 16 September 2000 13:48:52 UTC
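
For readers following the thread, the rule discussed in the message (name the encoding in the XML declaration only when the output contains non-ASCII bytes in a non-UTF-8 encoding) can be sketched roughly as below. This is an illustration only, not Tidy's code: the function names `needs_encoding_decl` and `xml_declaration` are hypothetical, and the actual changes to AdjustConfig() and FixXMLPI() are not shown in the message.

```python
def needs_encoding_decl(data: bytes, encoding: str) -> bool:
    """Return True if the XML declaration should name the encoding.

    Assumption for this sketch: only UTF-8 is treated as the parser
    default, as described in the thread.
    """
    if encoding.lower() in ("utf-8", "utf8"):
        return False
    # Pure US-ASCII output is also valid UTF-8, so no encoding is needed.
    return any(byte > 0x7F for byte in data)


def xml_declaration(data: bytes, encoding: str) -> str:
    """Build the XML declaration for the given output bytes (hypothetical helper)."""
    if needs_encoding_decl(data, encoding):
        return f'<?xml version="1.0" encoding="{encoding}"?>'
    return '<?xml version="1.0"?>'


# Raw Latin-1 byte 0xE5 ("å") forces an explicit encoding.
print(xml_declaration(b"St\xe5ldal", "iso-8859-1"))
# The same character written as a numeric entity keeps the file US-ASCII.
print(xml_declaration(b"St&#229;ldal", "iso-8859-1"))
```

The check is deliberately conservative: it inspects bytes rather than characters, so it applies to whatever the serializer finally writes out.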