[Bug 12539] The numeric references to produce the glyphs in the third column should use the characters listed in the second column. lang (and aliases) list (correctly) U+27ea, but the glyph is produced by #9001 (U+2329) which is not in normal form C and generates va

https://www.w3.org/Bugs/Public/show_bug.cgi?id=12539

--- Comment #9 from David Carlisle <davidc@nag.co.uk> 2011-12-21 10:30:56 UTC ---
(In reply to comment #8)
> As far as I can tell, it's a python bug or maybe lxml bug. I think the simplest
> way to deal with it would be to have anolis and/or the splitter script do the
> s/#9001;/#x27E8;/g and s/#9002;/#x27E9;/ -- or run some post-processing script
> (perl or sed or python or whatever) on the anolis/splitter output to do it.


"Bug" is probably a bit harsh; it is fairer to say that you're processing the
html(5) spec with an html4 parser. But it comes to the same thing: those entity
references get the old/wrong (html4) values.
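
To make the html4/html5 difference concrete, here is a minimal Python sketch of
the post-processing step suggested in comment #8. The function name
fix_angle_brackets is hypothetical, and the sketch assumes the serialized output
contains the references in exactly the decimal form &#9001; and &#9002;. It uses
Python 3's html.entities module, which carries both the html4 and the html5
entity tables, and is not the actual anolis or splitter code.

```python
import html.entities
import re

# html4 table: lang/rang map to the old code points U+2329 / U+232A.
HTML4_LANG = html.entities.name2codepoint["lang"]  # 9001 (0x2329)
HTML4_RANG = html.entities.name2codepoint["rang"]  # 9002 (0x232A)

# html5 table: the same names map to U+27E8 / U+27E9.
HTML5_LANG = html.entities.html5["lang;"]  # "\u27e8"
HTML5_RANG = html.entities.html5["rang;"]  # "\u27e9"

# Assumed replacement map: legacy decimal reference -> html5 hex reference.
LEGACY_TO_HTML5 = {
    "9001": "&#x27E8;",
    "9002": "&#x27E9;",
}

_LEGACY_REF = re.compile(r"&#(9001|9002);")


def fix_angle_brackets(text: str) -> str:
    """Rewrite every &#9001; / &#9002; in text to the html5 code points."""
    return _LEGACY_REF.sub(lambda m: LEGACY_TO_HTML5[m.group(1)], text)


if __name__ == "__main__":
    print(fix_angle_brackets("<td>&#9001;</td><td>&#9002;</td>"))
```

Unlike the two sed expressions quoted above, this replaces every occurrence of
both references, not just the first #9002; on each line.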


Received on Wednesday, 21 December 2011 10:30:58 UTC