[Bug 3164] non SGML character number 128-159

http://www.w3.org/Bugs/Public/show_bug.cgi?id=3164

------- Comment #7 from bjoern@hoehrmann.de  2007-04-29 22:46 -------
(In reply to comment #4)
> Bjoern, any idea when the new version with the spo bug fix would be in?

I don't know which bug I was talking about here, so I can't comment on that. For
the Wikipedia document, HEAD seems to emit garbage (ISO-8859-1 encoded
characters in a UTF-8 encoded document). My guess is that the source code does
not have the UTF-8 bit on, or that the output stream is not marked as UTF-8, or
something along those lines. It seems I did release a new spo version after my
comment, so I suppose it's one of

  - fixed a bug in how parse_string handles encodings
  - fixed a bug in handling warnings(qw/multiple args/)
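
To illustrate the "ISO-8859-1 encoded chars in a UTF-8 encoded document" symptom mentioned above, here is a minimal Python sketch. It is not code from spo; the sample string and the helper name is_valid_utf8 are made up for the example. It only shows why a Latin-1 byte shows up as garbage once the stream is read as UTF-8.

```python
# Hypothetical illustration; none of these names come from spo itself.
text = "caf\u00e9"  # ends in U+00E9, a non-ASCII character

utf8_bytes = text.encode("utf-8")          # U+00E9 becomes two bytes: C3 A9
latin1_bytes = text.encode("iso-8859-1")   # U+00E9 becomes one byte: E9


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(utf8_bytes))    # the correctly encoded form
print(is_valid_utf8(latin1_bytes))  # the Latin-1 form in a UTF-8 context
```

A lone E9 byte is not a complete UTF-8 sequence, so a consumer that expects UTF-8 either rejects it or shows a replacement character.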

The errors you get for C1 characters are due to xml.dcl, which has

  CHARSET
    DESCSET
      128 32 UNUSED

which comes from http://www.w3.org/TR/NOTE-sgml-xml-971215 and is probably wrong.
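
As a rough sketch of what that DESCSET entry means: "128 32 UNUSED" describes a run of 32 character numbers starting at 128 that have no meaning assigned, which is the 128-159 range in the bug title. The Python below just does that arithmetic; the names START, COUNT and is_non_sgml are hypothetical and not taken from any parser.

```python
# DESCSET entry "128 32 UNUSED": start number 128, run length 32.
# START, COUNT and is_non_sgml are illustrative names only.
START, COUNT = 128, 32


def is_non_sgml(char_number: int) -> bool:
    """True if char_number falls inside the run declared UNUSED."""
    return START <= char_number < START + COUNT


unused = range(START, START + COUNT)
print(unused[0], unused[-1])   # first and last character number in the run
print(is_non_sgml(0x85))       # U+0085 (NEL), a C1 control character
print(is_non_sgml(0xA0))       # U+00A0 (NO-BREAK SPACE), just past the run
```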

Received on Sunday, 29 April 2007 22:46:37 UTC