Re: Trying out SVG and MathML parsing

Henri Sivonen wrote:
> 
> There was some discussion about SVG parsing on IRC today. Since I 
> happened to have something almost ready, I figured I'd put a build out 
> there before I head to Midsummer/St.John festivities (national holiday; 
> big deal over here). [...]

I tried this by taking a few hundred random SVG files from Wikipedia, 
passing them through html2xml to produce XHTML output, then visually 
comparing against the originals. The problems I noticed are:

* Many files have prefixed attributes such as xlink:href, 
sodipodi:version and i:vieworigin, which make html2xml's output 
ill-formed because it doesn't emit a matching xmlns declaration for the 
prefix. (This may have masked other problems from me, since it made most 
of the images unviewable.)

* HTML5's treatment of <font> (i.e. exiting from the SVG mode) breaks a 
number of images:

http://upload.wikimedia.org/wikipedia/en/b/b5/Lindos5.svg
http://upload.wikimedia.org/wikipedia/en/1/17/Nilt-Political_Attitudes-NIRELAND-2006.svg
http://upload.wikimedia.org/wikipedia/en/b/be/PersCorpINtax_wi_5.svg
http://upload.wikimedia.org/wikipedia/en/4/40/Telecom.svg

* In many cases, Illustrator's fancy doctype tricks like:

   <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
     "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd" [
       <!ENTITY ns_svg "http://www.w3.org/2000/svg">
       <!ENTITY ns_xlink "http://www.w3.org/1999/xlink">
   ]>
   <svg xmlns="&ns_svg;" xmlns:xlink="&ns_xlink;" ...

make the text "]>" appear in the <body> (because HTML5 breaks out of the 
doctype when it sees the first '>'). (It also triggers 
<http://bugzilla.validator.nu/show_bug.cgi?id=255>.)

* Often the sizes seem to get broken so the SVG-in-XHTML image is tiny 
or huge, e.g. 
<http://upload.wikimedia.org/wikipedia/en/6/69/CDGlogo.svg> vs 
<http://philip.html5.org/misc/CDGlogo.xhtml>. (I don't know enough about 
SVG sizing to understand why this problem occurs.)

* The "gradientUnits" attribute is converted into "gradientunits" which 
doesn't work, breaking 
<http://upload.wikimedia.org/wikipedia/en/5/54/Microsoft_Windows_XP_Logo.svg>. 
(<http://bugzilla.validator.nu/show_bug.cgi?id=256>.)
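
For the gradientUnits item, here is a rough sketch of the kind of fix-up 
that seems to be needed. It is not what the parser build actually does. 
The table and the function name restore_svg_attribute_case are made up 
for illustration, and the table lists only a handful of SVG 1.1 
camelCase attribute names, so it is far from complete:

```python
# Hypothetical fix-up table: lowercased name -> SVG 1.1 camelCase name.
# Only a few entries are shown; a real table would need to be complete.
SVG_ATTRIBUTE_CASE = {
    name.lower(): name
    for name in (
        "gradientUnits",
        "gradientTransform",
        "patternUnits",
        "preserveAspectRatio",
        "spreadMethod",
        "viewBox",
    )
}


def restore_svg_attribute_case(attrs):
    """Return a copy of attrs with known SVG names restored to camelCase.

    Names that are not in the table (e.g. "id") are passed through as-is.
    """
    return {
        SVG_ATTRIBUTE_CASE.get(name, name): value
        for name, value in attrs.items()
    }


# Attributes as they might look after the HTML tokenizer lowercased them.
lowercased = {"id": "g1", "gradientunits": "userSpaceOnUse"}
print(restore_svg_attribute_case(lowercased))
```

Something like this would only be applied to elements in the SVG 
namespace, so ordinary HTML attributes stay lowercase.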

Apart from those problems, it appeared to work fine on that SVG content.
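
The xmlns problem in the first item can be reproduced in isolation. This 
is only a sketch, using Python's standard xml.etree.ElementTree as the 
namespace-aware XML parser. The two sample strings and the helper name 
parses_as_xml are mine and stand in for real html2xml output:

```python
import xml.etree.ElementTree as ET


def parses_as_xml(xml_text):
    """Return True if a namespace-aware XML parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Stand-in for output that uses the xlink prefix without declaring it.
without_decl = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<use xlink:href="#a"/></svg>'
)

# The same markup with the prefix declared on the root element.
with_decl = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<use xlink:href="#a"/></svg>'
)

print(parses_as_xml(without_decl))
print(parses_as_xml(with_decl))
```

The first string is rejected because the xlink prefix is unbound; the 
second parses. The same would apply to the sodipodi: and i: prefixes.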

-- 
Philip Taylor
pjt47@cam.ac.uk

Received on Thursday, 19 June 2008 18:48:23 UTC