Re: What namespace features popular SVG tools really emit (ISSUE-37)

On Aug 4, 2008, at 18:20, Henri Sivonen wrote:

> NS errors: 0.0036495215831587324
> Encoding errors: 7.460691481755501E-5
> Other WF errors: 1.6164831543803585E-4

It appears that files that fail to parse as XML can be successfully
uploaded to the Commons. Here we see that errors on the namespace
layer (unbound prefixes or attempts to bind a prefix to the empty
string), at about 0.36% of the files, are far more common than errors
on the XML 1.0 layer (about 0.007% and 0.016% of the files).

I used Xerces-J, which treated Namespace-layer errors as fatal.
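
For illustration only, here is a minimal sketch of how namespace-layer
errors can be told apart from other well-formedness errors. It uses
Python's bundled expat binding rather than Xerces-J, so it is not the
code behind the numbers above; the function name `classify` and the
bucket labels are assumptions made for this example.

```python
from xml.parsers import expat
from xml.parsers.expat import errors

# Expat error codes that belong to the Namespaces layer rather than
# to XML 1.0 proper: an unbound prefix, and binding a prefix to "".
NAMESPACE_ERROR_CODES = {
    errors.codes[errors.XML_ERROR_UNBOUND_PREFIX],
    errors.codes[errors.XML_ERROR_UNDECLARING_PREFIX],
}


def classify(data: bytes) -> str:
    """Return 'ok', 'namespace' or 'other' for one document."""
    # A namespace separator switches expat into namespace-aware mode.
    parser = expat.ParserCreate(namespace_separator=" ")
    try:
        parser.Parse(data, True)
    except expat.ExpatError as err:
        return "namespace" if err.code in NAMESPACE_ERROR_CODES else "other"
    return "ok"
```

Encoding errors are not separated out in this sketch; they would land
in the 'other' bucket or surface as a different exception.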

> ----------------
> Total: 160218

This first batch of output contains results for files from all tools.

Below, all the floats are ratios of the number of *files* containing a
given trait to the total number of files. That is, a file exhibiting a
trait twice counts only once.
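
The per-file counting rule can be sketched as follows. This is only an
illustration of the rule as stated; `trait_ratios` and its input shape
(one list of trait names per file) are assumptions for the example.

```python
from collections import Counter


def trait_ratios(files):
    """files: iterable of per-file trait lists (repeats allowed)."""
    counts = Counter()
    total = 0
    for traits in files:
        total += 1
        # set() collapses repeats, so a trait counts once per file.
        counts.update(set(traits))
    return {trait: n / total for trait, n in counts.items()}
```

With four files, of which two have a doctype (one of them reporting the
trait twice) and one has a flowRoot, this yields 0.5 for hasDoctype and
0.25 for hasFlowRoot.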

> nonSvgRoot: 0.000000

There were no files whose root element had a local name other than
'svg'.

> nonNamespaceSvgRoot: 0.012202

Over 1% of the files had the root element in no namespace. (Note that
I had configured the entity resolver of the XML parser to resolve
external DTDs to an empty stream in order to approximate what browsers
do.)
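
A rough sketch of that kind of resolver setup, using Python's xml.sax
instead of Xerces-J. The class and function names are made up for this
example, and the behavior only approximates the configuration
described above.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource


class EmptyStreamResolver(EntityResolver):
    """Answer every external entity request with an empty byte stream."""

    def __init__(self):
        self.requested = []

    def resolveEntity(self, publicId, systemId):
        self.requested.append(systemId)
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(b""))  # nothing is fetched
        return source


class RootRecorder(ContentHandler):
    """Remember the name of the first element seen."""

    def __init__(self):
        super().__init__()
        self.root = None

    def startElement(self, name, attrs):
        if self.root is None:
            self.root = name


def parse_without_fetching(data: bytes):
    reader = xml.sax.make_parser()
    # Route external entities (including the external DTD subset)
    # through the resolver instead of skipping them outright.
    reader.setFeature(feature_external_ges, True)
    resolver, handler = EmptyStreamResolver(), RootRecorder()
    reader.setEntityResolver(resolver)
    reader.setContentHandler(handler)
    reader.parse(io.BytesIO(data))
    return handler.root, resolver.requested
```

The resolver sees the system identifier of the SVG DTD but hands back
an empty stream, so no network access takes place.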

The Adobe SVG plug-in at least used to render files like this.
Firefox doesn't support files like this. According to
<http://www.w3.org/mid/op.ue2v5dj2wxe0ny@widsith.local>, Opera has
knowingly removed support for files like this without an outcry.

I'd like to re-emphasize that the figure is over 1%!

> otherNamespaceSvgRoot: 0.000006

Apparently someone has at some point mistyped the NS URI.
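
The three root-element categories can be distinguished along these
lines. This is a sketch with Python's ElementTree, not the original
analysis code; `root_category` is a hypothetical helper and the
'svgRoot' label for the normal case is invented here.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def root_category(data: bytes) -> str:
    root = ET.fromstring(data)
    # ElementTree spells namespaced names as "{uri}local".
    if root.tag.startswith("{"):
        uri, local = root.tag[1:].split("}", 1)
    else:
        uri, local = "", root.tag
    if local != "svg":
        return "nonSvgRoot"
    if uri == "":
        return "nonNamespaceSvgRoot"
    if uri != SVG_NS:
        return "otherNamespaceSvgRoot"
    return "svgRoot"
```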

> hasFlowRoot: 0.006541

6.5‰ of the files contain a flowRoot element. (heycam asked for this  
data.)

> hasDoctype: 0.237558

24% of the files have a doctype.

> hasInternalSubset: 0.054513

5% declare something in the internal subset.

> hasMetadata: 0.559070

56% have child elements in the metadata element.

> hasStyleAttribute: 0.816338
> hasPresentationAttributes: 0.372199
> hasStyleElement: 0.102816

The style attribute is the most popular way to style SVG.
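
As an illustration of what the three styling traits mean, here is a
small ElementTree sketch. The list of presentation attributes is a
deliberately short, assumed subset, and `styling_traits` is a
hypothetical name; the real tally may have used different criteria.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Illustrative subset only; the full set of presentation attributes
# is considerably longer.
PRESENTATION_ATTRIBUTES = {
    "fill", "stroke", "stroke-width", "opacity", "font-family", "font-size",
}


def styling_traits(data: bytes) -> set:
    traits = set()
    for element in ET.fromstring(data).iter():
        if element.tag == "{%s}style" % SVG_NS:
            traits.add("hasStyleElement")
        if "style" in element.attrib:
            traits.add("hasStyleAttribute")
        if PRESENTATION_ATTRIBUTES.intersection(element.attrib):
            traits.add("hasPresentationAttributes")
    return traits
```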

> hasDefinitionElementsOutsideDefs: 0.209852

This is also for heycam.

> prefixedSvgElements
> ANY: 0.003227

3‰ of the files have prefixed elements in the SVG namespace. That's 3
per*mille*. The SVG WG's proposal requires support for this case, but
requires failure in the case where the namespace isn't declared at
all, which accounts for over 1 per*cent*.

> http://www.w3.org/2000/svg svg defs: 0.003214
> http://www.w3.org/2000/svg svg svg: 0.003214
> http://www.w3.org/2000/svg svg path: 0.003127
> http://www.w3.org/2000/svg svg g: 0.002852
> http://www.w3.org/2000/svg svg metadata: 0.002465
> http://www.w3.org/2000/svg svg rect: 0.001766
> http://www.w3.org/2000/svg svg polygon: 0.001404
> http://www.w3.org/2000/svg svg linearGradient: 0.001248
> http://www.w3.org/2000/svg svg stop: 0.001248
> http://www.w3.org/2000/svg svg polyline: 0.001223
> http://www.w3.org/2000/svg svg text: 0.001217
> http://www.w3.org/2000/svg svg tspan: 0.001192
> http://www.w3.org/2000/svg svg foreignObject: 0.001186
> http://www.w3.org/2000/svg svg radialGradient: 0.001130
> http://www.w3.org/2000/svg svg line: 0.001036
> http://www.w3.org/2000/svg svg clipPath: 0.000836
> http://www.w3.org/2000/svg svg pattern: 0.000830
> http://www.w3.org/2000/svg svg desc: 0.000768
> http://www.w3.org/2000/svg svg circle: 0.000362
> http://www.w3.org/2000/svg svg style: 0.000212
> http://www.w3.org/2000/svg svg switch: 0.000169
> http://www.w3.org/2000/svg svg ellipse: 0.000137
> http://www.w3.org/2000/svg svg xpacket: 0.000100
> http://www.w3.org/2000/svg svg marker: 0.000094
> http://www.w3.org/2000/svg svg use: 0.000075
> http://www.w3.org/2000/svg svg symbol: 0.000056
> http://www.w3.org/2000/svg svg font: 0.000044
> http://www.w3.org/2000/svg svg font-face: 0.000044
> http://www.w3.org/2000/svg svg glyph: 0.000044
> http://www.w3.org/2000/svg svg missing-glyph: 0.000044
> http://www.w3.org/2000/svg svg filter: 0.000037
> http://www.w3.org/2000/svg svg feGaussianBlur: 0.000025
> http://www.w3.org/2000/svg svg mask: 0.000019
> http://www.w3.org/2000/svg svg feDiffuseLighting: 0.000012
> http://www.w3.org/2000/svg svg feDistantLight: 0.000012
> http://www.w3.org/2000/svg svg image: 0.000012
> http://www.w3.org/2000/svg svg midPointStop: 0.000012
> http://www.w3.org/2000/svg svg flowPara: 0.000006
> http://www.w3.org/2000/svg svg flowRegion: 0.000006
> http://www.w3.org/2000/svg svg flowRoot: 0.000006
> http://www.w3.org/2000/svg svg flowSpan: 0.000006
> http://www.w3.org/2000/svg svg pgfRef: 0.000006
> http://www.w3.org/2000/svg svg textPath: 0.000006
> http://www.w3.org/2000/svg svg title: 0.000006

The only prefix ever used for elements from the SVG namespace is 'svg'.
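
A check of this kind needs access to the prefix, which some APIs
discard. One way to sketch it is with Python's minidom, which keeps the
prefix on each node. `prefixed_svg_elements` is a hypothetical helper,
not the code that produced the list above.

```python
from xml.dom import minidom

SVG_NS = "http://www.w3.org/2000/svg"


def prefixed_svg_elements(data: bytes) -> set:
    """Return (prefix, localName) pairs for prefixed SVG elements."""
    document = minidom.parseString(data)
    found = set()
    for element in document.getElementsByTagName("*"):
        # Elements in the default namespace have no prefix (None).
        if element.namespaceURI == SVG_NS and element.prefix:
            found.add((element.prefix, element.localName))
    return found
```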

> foreignElementsInMetadata
> ANY: 0.559070
...
> http://creativecommons.org/ns# ns Work: 0.001280
> http://www.w3.org/1999/02/22-rdf-syntax-ns# rdf Seq: 0.001080
> xpacket: 0.000886
...

Lots of stuff in <metadata>.

Most of it is prefixed, but e.g. xpacket above isn't.

> &ns_vars; ns sampleDataSets: 0.000012

Looks like round-tripping of Illustrator output isn't quite working...

> foreignElementsElsewhere
...
> http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd sodipodi  
> namedview: 0.461396
> http://inkscape.sourceforge.net/DTD/sodipodi-0.dtd sodipodi  
> namedview: 0.085521
> http://www.inkscape.org/namespaces/inkscape inkscape perspective:  
> 0.023537
> http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd sodipodi guide:  
> 0.017420
> http://ns.adobe.com/AdobeIllustrator/10.0/ i pgfRef: 0.014387

Outside metadata, foreign elements are common and are used for storing  
editor-specific state that renderers are supposed to ignore.

> svg: 0.012202
> path: 0.010224

SVG elements that are erroneously in no namespace show up here.

> prefixedAttributes
> ANY: 0.570991

Lots of those--all to be ignored by renderers.

> fontAttributes
> ANY: 0.007303
> id: 0.005655
> horiz-adv-x: 0.004157
> fontVariant: 0.001604
> fontWeight: 0.001604
> fullFontName: 0.001604
> font-variant: 0.001255
> font-weight: 0.001255
> style: 0.001217
> font-style: 0.001049
> family: 0.000256
> size: 0.000256
> fontStyle: 0.000094
> class: 0.000037
> fill-rule: 0.000037
> color: 0.000012

I forgot to count font elements that don't have any attributes.

> unconventionalXLinkPrefixes
> ANY: 0.000050
> xl: 0.000037
> NS4: 0.000006
> l: 0.000006

It is *extremely* rare for XLink attributes to use a prefix other than
'xlink'.
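
For illustration, such prefixes could be collected as below. The sketch
assumes the statistic is about attributes in the XLink namespace (such
as href); `unconventional_xlink_prefixes` is a hypothetical name.

```python
from xml.dom import minidom

XLINK_NS = "http://www.w3.org/1999/xlink"


def unconventional_xlink_prefixes(data: bytes) -> set:
    document = minidom.parseString(data)
    prefixes = set()
    for element in document.getElementsByTagName("*"):
        attributes = element.attributes
        for index in range(attributes.length):
            attribute = attributes.item(index)
            # xmlns:* declarations live in a different namespace,
            # so only real XLink attributes match here.
            if attribute.namespaceURI == XLINK_NS and attribute.prefix != "xlink":
                prefixes.add(attribute.prefix)
    return prefixes
```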

> fontParent
> ANY: 0.007303
> defs: 0.004956
> svg: 0.001785

The <font> element does occur outside of <defs> relatively often.

> : 0.000562

This means that the parent was not in the SVG namespace.

> piTargets
> ANY: 0.006054
> xpacket: 0.006036
> adobe-xap-filters: 0.000137
> xml-stylesheet: 0.000012
> xL: 0.000006

Processing instructions pertain mainly to XMP.
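
Collecting processing instruction targets is straightforward with a
streaming parser. A minimal sketch using Python's expat binding
(`pi_targets` is a name chosen for this example):

```python
from xml.parsers import expat


def pi_targets(data: bytes) -> set:
    targets = set()
    parser = expat.ParserCreate()
    # Called once per processing instruction; the XML declaration
    # is not a processing instruction and is not reported here.
    parser.ProcessingInstructionHandler = (
        lambda target, content: targets.add(target)
    )
    parser.Parse(data, True)
    return targets
```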

> requiredExtensions
> ANY: 0.014780
> http://ns.adobe.com/AdobeIllustrator/10.0/: 0.014424
> http://ns.adobe.com/Flows/1.0/: 0.000880
> http://ns.adobe.com/ImageReplacement/1.0/: 0.000368
> http://schemas.microsoft.com/visio/2003/SVGExtensions/: 0.000062
> http://ns.adobe.com/Graphs/1.0/: 0.000056
> : 0.000006

These are for heycam.

> internalEntities
> ANY: 0.053271
> ns_svg: 0.048921
> ns_xlink: 0.048871
> ns_flows: 0.016016
> ns_ai: 0.009880
> ns_extend: 0.009880
> ns_graphs: 0.009880
> ns_adobe_xpath: 0.009331
> ns_custom: 0.009331
> ns_imrep: 0.009331
> ns_sfw: 0.009331
> ns_vars: 0.009331
> st0: 0.001017
> st1: 0.000961
> st2: 0.000868
> st3: 0.000780

Entities declared in the internal subset are mainly weird NS
declarations emitted by Adobe Illustrator. Copying and pasting the
part of the document after the doctype into text/html under the SVG
WG's proposal doesn't work without further editing. Under the
proposal that is commented out in the HTML5 draft, the pasted stuff
should render without further editing even though it would be
non-conforming.
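
To make the problem concrete, here is a sketch that lists the internal
general entities of a document and shows that an XML parser expands
them inside namespace declarations. The sample document in the usage
note is a hand-written stand-in for the Illustrator pattern, and
`internal_entities` is a hypothetical helper.

```python
from xml.parsers import expat


def internal_entities(data: bytes):
    """Return ({entity name: value}, expanded name of the root)."""
    entities = {}
    root = []

    def on_entity(name, is_parameter, value, base, system_id, public_id, notation):
        # value is None for external entities; keep internal ones only.
        if value is not None and not is_parameter:
            entities[name] = value

    def on_start(name, attributes):
        if not root:
            root.append(name)

    parser = expat.ParserCreate(namespace_separator=" ")
    parser.EntityDeclHandler = on_entity
    parser.StartElementHandler = on_start
    parser.Parse(data, True)
    return entities, root[0]
```

Given a doctype that declares ns_svg as "http://www.w3.org/2000/svg"
and a root element written as <svg xmlns="&ns_svg;">, the XML parser
puts the root in the SVG namespace. Once the doctype is cut off, the
entity reference has nothing left to refer to.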

> creator
> Inkscape: 0.437211
> NO CREATOR: 0.349118
> Adobe Illustrator: 0.133225
> Arkyan's SVGCensus script: 0.071415
> Sodipodi: 0.008164
> Notepad: 0.000243
> svg-rocco-library: 0.000169

Inkscape and Illustrator are the most popular identifiable products.

> ********************************

> texteditor
> ----------------
> Total: 4

Per-editor stats follow, in random order.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Monday, 4 August 2008 16:00:32 UTC