Re: Element Attribute and Property tables for Integration Specification [ACTION-2707]

Hi Doug,

After much mud, fret and beers I've devised a Perl script which is epic! As a 
side note, it makes a nice HTML attribute table as well.

Doug Schepers wrote:
> Hi, Folks-
> 
> I tried (again) to publish SVG Integration, but after 4 hours or so, the 
> tables once again defeated me.
> 
> Both tables have many broken links, and here are a few of the problems 
> with the attribute/property table:
> 
> * Some attributes are duplicated over different rows... for some 
> attributes, each element has a separate row for SVG 1.1, while these are 
> normally correctly collected together for SVGT1.2 (this leads to 
> duplicate row ids, and thus invalid documents unusable as a spec that is 
> meant to be referenced)

The new script handles this case and will only duplicate an attribute row if it 
finds an attribute with two different links. In this case the id for each 
attribute is the attribute name and the fragment of the spec link joined 
together with an underscore '_', i.e. "[attribute name]_[section fragment]".
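
To make the id rule concrete, here is a minimal sketch in Python (not the actual 
Perl script). The function name row_ids and the sample links are hypothetical, 
and it assumes that an attribute with only one distinct link keeps its bare name 
as the row id, which the description above does not spell out.

```python
from urllib.parse import urlsplit


def row_ids(name, links):
    """Return one row id per distinct spec link found for an attribute name."""
    # Drop repeated links while keeping first-seen order.
    distinct = list(dict.fromkeys(links))
    if len(distinct) <= 1:
        # Assumption: a single-link attribute uses its bare name as the id.
        return [name]
    # Two or more different links: "[attribute name]_[section fragment]".
    return [f"{name}_{urlsplit(link).fragment}" for link in distinct]


# Hypothetical links, for illustration only.
print(row_ids("x", ["coords.html#XAttr", "coords.html#XAttr"]))
print(row_ids("x", ["coords.html#XAttr", "shapes.html#RectX"]))
```

Because each id carries the fragment of its own link, two rows for the same 
attribute name can only share an id if their links share a fragment.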

> * SVGT1.2 had the wrong element name and link for <linearGradient> 
> elements (about 20 times)... it linked to the <line> element instead... 
> this may be a problem with the schema

This problem was due to the old script. The new script knows which spec each 
link has come from; it does this by maintaining tables in memory that pair up 
names and links.
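
As a rough illustration of that idea, sketched in Python rather than taken from 
the Perl script: one table per spec, keyed by the exact element name. The 
function name build_tables and the link values are made up for the example.

```python
from collections import defaultdict


def build_tables(records):
    """Group (spec, name, link) records into one name -> link table per spec."""
    tables = defaultdict(dict)
    for spec, name, link in records:
        tables[spec][name] = link
    return tables


# Illustrative records only; the real links live in each spec's master directory.
records = [
    ("SVG11", "line", "shapes.html#LineElement"),
    ("SVG11", "linearGradient", "pservers.html#LinearGradientElement"),
    ("SVGT12", "line", "shapes.html#LineElement"),
    ("SVGT12", "linearGradient", "painting.html#LinearGradientElement"),
]

tables = build_tables(records)
print(tables["SVGT12"]["linearGradient"])
```

Since the lookup is by spec and by exact name, "linearGradient" has its own 
entry in each spec's table and cannot pick up the link stored for "line".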

> * there are no properties at all, just attributes

Ummm, yeah sadly this is still the case... BUT don't panic, the architecture of 
the new script allows properties to be generated in the HTML output as well. We 
should discuss this format and how you want it to look at the next telcon.

> * lots of problems around the xlink:*/xml:* attributes (damn namespaces!)

The new script preserves all xlink:* and xml:* attributes (<3 namespaces)

> ** all of the xlink:* attributes are missing their text content for the 
> attribute name column (for SVG11) or just have "xlink" (for SVGT12)

Text content? I'm probably too tired to realise what you mean, so you'll have to 
explain this to me at the next telcon please. In any case, the xlink name 
problem has been fixed as mentioned above.

> ** xlink:show has weird duplication/aggregation pattern for SVGT12

The new script eliminates this weird duplication/aggregation pattern for SVGT12.

> 
> I've tried to correct the tables manually, so we can publish it as-is 
> this time, but couldn't get it done in time.  Maybe we can look at the 
> process for how these are generated and fix it for the next attempt. 
> Doing it manually is just silly, but I didn't know the script at all, 
> nor where the data is mined from.  The Perl script had no comments and 
> no readme, so I didn't even try to fix it.
> 

Doing it manually now is even more silly given we have the new script. Despite 
having written the script I hardly know it either - it's amazingly big... just 
kidding.  The data is currently mined from the definitions.xml files in the 
master directories of each of the specs.

> Since this is not just a one-off (I expect this document to be 
> maintained over time), we need to be much more systematic about this, 
> and to document the process so we any of us could pick it up.
> 

Yes, good point. I will put some instructions up on how to use the script. But 
there isn't much to it - just run "perl attribute_table_merge.pl" on a command 
line and... BAM! instant HTML attribute table.

> Here are a few questions that would help me understand the process better:
> * Where is the data coming from?
> ** Was it scraped from the spec, mined from schema or DTD?
> ** Can we format that data better for reuse?

From the definitions.xml files in the master directories. Not sure about 
formatting for better reuse. The definitions.xml format is pretty good for 
specifying the links to the different parts of the spec.
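
For anyone who hasn't looked at the data, this is roughly what mining it could 
look like. It is a Python sketch, not the Perl script, and the XML layout is an 
assumption made for the example: <element> entries holding <attribute> children, 
each with "name" and "href". Check the real definitions.xml before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real file's structure may differ.
SAMPLE = """
<definitions>
  <element name="rect" href="shapes.html#RectElement">
    <attribute name="x" href="shapes.html#RectElementXAttribute"/>
    <attribute name="xlink:href" href="linking.html#XLinkHrefAttribute"/>
  </element>
</definitions>
"""


def read_links(xml_text):
    """Return (element name, attribute name, link) triples from the XML text."""
    root = ET.fromstring(xml_text)
    triples = []
    for element in root.iter("element"):
        for attribute in element.iter("attribute"):
            # The name is read as a plain string, so "xlink:href" stays whole.
            triples.append(
                (element.get("name"), attribute.get("name"), attribute.get("href"))
            )
    return triples


for triple in read_links(SAMPLE):
    print(triple)
```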

> * How do the scripts work?
> ** What steps are involved?
> ** I take it that there is a script that creates the data for each of 
> SVG11 and SVGT12, then another script that collates them together?
> 

It's all in one now and very simple to use. The HTML output even passes the W3C 
HTML validation page (well, tentatively with 3 warnings - no char encoding, no 
doc type, I forget the third one). If you get a chance you should run it 
through pubrules and see how it goes.

Note, I have noticed that there are some issues with wrong or missing links to 
specs in the definitions.xml file that need to be fixed. The new script is aware 
of some of these problems and attempts to produce valid HTML output when they 
are encountered.
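
One way to keep the output valid when a link is missing, sketched in Python as 
an illustration only (the Perl script may handle it differently): fall back to 
plain text instead of emitting an anchor with an empty href. The function name 
name_cell is hypothetical.

```python
from html import escape


def name_cell(name, link):
    """Build the name cell for a row, linking it only when a link is present."""
    text = escape(name)
    if link and link.strip():
        href = escape(link.strip(), quote=True)
        return f'<td><a href="{href}">{text}</a></td>'
    # Missing or blank link: emit plain text rather than a broken anchor.
    return f"<td>{text}</td>"


print(name_cell("x", "coords.html#XAttr"))
print(name_cell("xlink:show", None))
```

This only covers missing links; a link that is present but points at the wrong 
section still has to be fixed in the source data.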

Off the top of my head, things left to do:
  - Wiki page with some details
  - Properties output in same HTML output
  - Investigate element output in a separate HTML output file
  - Coffee making function

Anyway, enough banging on about the script. If there are any problems or fixes 
of any sort let me know.

Thanks,

Anthony

Received on Thursday, 21 January 2010 09:36:30 UTC