XSL pains from Attila Torcsvari on 1998-08-28 (xsl-editors@w3.org from July to September 1998)

From: Attila Torcsvari <arcdev@mail.matav.hu>
Date: Fri, 28 Aug 1998 15:07:11 +0200
To: "'xsl-editors@w3.org'" <xsl-editors@w3.org>
Message-ID: <01BDD295.89439670@wingate>

Dear Editors,
I look at XSL as a data preparer and as a content and search engine developer. I was happy to read the XSL draft; however, I got the feeling that I am missing some information that I need in order to understand it completely.
The transformation language is really cute, and I would like to use it or something similar for standardizing search key generation.
I hope my comments are valuable and do not waste your time. Please tell me if the comments are useless and I should not bother you.
Sincerely yours,
Attila Torcsvari
Arcanum Development
tech. manager

I really do not understand...
1. XML Transformation Language (XTL) and Formatting Language
In my view these are two distinct things: transforming the document tree into another one, and adding formatting tags. (If the two steps are done at once in a browser, fine, but that might be a purely technical decision.)
I would be a happy XS(T)L user for generating index keys for search engines, for creating catchword lists (quick-index-like), or even for generating HTML (as long as there is no browser for XML). It could also be used for XML-to-XML applications, which I now handle with special-purpose programs.
Another use of the XTL would be adding formatting tags, which are specified in _another_ standard (the XSL itself).
I did not understand the 'fo' prefix either; to me it looked too close to 'foo'. In fact, 'fo' is really the 'xsl', and 'xsl' might be 'xtl'. Moreover, there is no 'x' in it. (Hm. Beyond the barriers of namespaces: conventions.)
What was the reason for putting these two distinct things into one standard? Was it because DSSSL did so?
2. Attribute repetition vs. inheritance
The formatting language part of the standard redundantly repeats a number of attributes; this could be simplified by using an inheritance-like description.
(You might think I would prefer a more object-oriented description, but I would be happy even with a %;-like solution.)
For example, all background-related attributes are repeated n times, whereas it would be easier for the reader to be told once that a particular formatting tag has background attributes, with the background properties described separately. Moreover, it might help browser implementors to deal with the standard.
Two thirds of the standard describes formatting objects, and for me this part is practically unreadable because of the redundancy.
Comment: I had the same two troubles with the DSSSL standard.
3. XSL vs. XPointer : 2.6
I do not really understand why XSL repeats and reconceptualizes things described in the XPointer standard. If XPointer is not good enough, why not change that one as well (it is also in draft state) and refer to XPointer from the XSL standard?

Further comments
4. String equality vs. regexp matching : regexp
To me it seems evident that regexp matching, rather than pure string equality, should be used everywhere an attribute etc. is tested. It would not be too big a load on parsers.
Example: I want to set up the same format for all H[0-9] elements.
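As a rough illustration of the request (in Python, not in XSL; the sample document and the function name elements_matching are made up for this sketch), matching element names against a regular expression instead of a fixed string could look like this:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are illustrative only.
SAMPLE = "<doc><H1>Title</H1><P>text</P><H2>Sub</H2><HR/><H10>x</H10></doc>"


def elements_matching(root, name_pattern):
    """Return every element whose complete tag name matches the regexp."""
    compiled = re.compile(name_pattern)
    # fullmatch() requires the whole name to match, so "H10" and "HR"
    # are not selected by the pattern H[0-9].
    return [el for el in root.iter() if compiled.fullmatch(el.tag)]


root = ET.fromstring(SAMPLE)
headings = elements_matching(root, r"H[0-9]")
print([el.tag for el in headings])
```

One pattern then covers all heading levels, where string equality would need a separate rule for each of H1, H2 and so on.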

5. PCDATA and patterns : pattern-text-target
To me it is also obvious that particular formatting might depend on the content of the PCDATA of a node (down to the PCDATA of any of its children). Example: I work for Boba-Bola and I want to see all paragraphs that contain "Bebsy Bola" somewhere with a BLUE background.
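A minimal sketch of such a content-dependent selection, again in Python rather than XSL. The sample paragraphs, the attribute name "background" and the function mark_paragraphs are assumptions made only for this illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the second paragraph has the phrase inside a child.
SAMPLE = (
    "<doc>"
    "<P>We sell Boba-Bola only.</P>"
    "<P>Some say <B>Bebsy Bola</B> is sweeter.</P>"
    "<P>Nothing to see here.</P>"
    "</doc>"
)


def mark_paragraphs(root, tag, needle, attr, value):
    """Set attr=value on each <tag> element whose text content,
    including the text of all its descendants, contains needle."""
    marked = []
    for el in root.iter(tag):
        text = "".join(el.itertext())  # PCDATA of the node and its children
        if needle in text:
            el.set(attr, value)
            marked.append(el)
    return marked


root = ET.fromstring(SAMPLE)
marked = mark_paragraphs(root, "P", "Bebsy Bola", "background", "blue")
print(len(marked))
```

The test looks at the concatenated text of the whole subtree, so the phrase is found even when it sits inside a child element such as <B>.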
6. Embedded xsl:template vs. xsl:choose
To me it seems more efficient to implement multiply nested xsl:template elements than flat templates with added ifs and whens, although the 'if' and 'when' constructs are unavoidable.
The subtree could be handled by the sub-templates, which would take precedence over anything above them. I do not think this would make life much harder for implementors.
7. Autonumbering: overriding
Our company has processed some (hundred) megabytes of legal data, and it turned out that automatic numbering regularly fails when formatting the text of laws. (These texts are fixed and must be 100% identical to the paper version, so whoever prepares the data is not free simply to add autonumbering.)
The usual story is that new articles are inserted between two regularly numbered articles. They do it like:
11 11a 12
or
11 11a 11aa 11ab 11b 12
or even worse
11 11bis 11ter 11quater ... 11quaterdecies 12
where, as you can imagine, the bis, ter etc. can be italic or superscript etc.
(11bis to 11quaterdecies might, of course, have additional attributes in the XML markup.)
For such cases I do not find any help in the specification of autonumbering in XSL: how can I cut the sequence and insert an item with the same indentation and an arbitrary prefix?
How can I specify <DL>,<DT>-like structures in XSL that will still be rendered the same way (same indentation) as autonumbers?
How can I restart numbering from 12 once the sequence has been interrupted?
8. Automatic numbering: format of the autonumbers
I did not find a way to specify right (hm, centered?) justification of the autonumber, which is widely used for Roman numerals. Is the autonumber always set as a hanging indent (rather than a first-line or any other indent)?
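To show what right justification of Roman autonumbers means, here is a small Python sketch; the function names and the trailing full stop after each numeral are choices made only for this illustration:

```python
def to_roman(n):
    """Convert an integer from 1 to 3999 to an upper-case Roman numeral."""
    pairs = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def right_justified_labels(count):
    """Return the labels I. to count., padded on the left to equal width."""
    labels = [to_roman(i) + "." for i in range(1, count + 1)]
    width = max(len(label) for label in labels)
    return [label.rjust(width) for label in labels]


for label in right_justified_labels(4):
    print(label + " item text")
```

With right justification the full stops line up in one column, which is the usual layout for Roman-numbered lists.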
9. External elements: select-function in 2.7.6
At the moment I am developing an XML application which works on about 20 individual, similarly structured XML files of 15 MByte each (see samples in HTML at http://www.arcanum.com/patclass/index.htm or http://www.wipo.org).
The data has many cross-references (40000).
I want to cut each file smartly into smaller XML files, and then I will have external references.
I can use XPointer and XLink, but when I put together the data for the user to see a particular paragraph, I want to help him by showing a help text about the actual reference. (Like in <A title="">.) It is not practical to include this text in the actual document, because that would double the size of the documents; therefore I want to collect this data during rendering. (Obviously it might be used only locally or on an intranet.)
10. String constants
What is a local constant? Anything defined for a template?
Are there not other "local" definitions? Macros? Further templates? etc.
11. Stylesheet inclusion/exclusion
How do these interact with user/browser stylesheets such as CSS?
12. Preserving position
A recent trouble on xml-dev was how to check two documents for equality if they have been reordered by XSL.
Also, if search keys are generated for indexing purposes, highlighting needs fine positioning as well. (Example: Smith, John vs. John Smith: when I find 'Smith' in the file, I would highlight it during presentation using a stylesheet which reorders the document; how do I tell the browser what to highlight?)
Is there no automatic way of preserving the position in the source tree?
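One conceivable way of preserving the source position, sketched in Python rather than XSL: every element is stamped with its position in the source tree before any reordering takes place. The attribute name "src-pos" and the dotted child-index notation are inventions of this sketch, not part of any draft:

```python
import xml.etree.ElementTree as ET

# Hypothetical source record, stored as "Smith, John".
SAMPLE = "<name><last>Smith</last><first>John</first></name>"


def annotate_source_positions(element, position="1"):
    """Record on each element its child-index path in the source tree."""
    element.set("src-pos", position)
    for index, child in enumerate(element, start=1):
        annotate_source_positions(child, f"{position}.{index}")


root = ET.fromstring(SAMPLE)
annotate_source_positions(root)

# Reorder for presentation as "John Smith": move <last> behind <first>.
last = root.find("last")
root.remove(last)
root.append(last)

# A search hit in the reordered tree can still be traced back to the source.
hit = next(el for el in root.iter() if el.text == "Smith")
print([child.tag for child in root], hit.get("src-pos"))
```

Since the position travels with the element, a hit on 'Smith' can be reported in terms of the source tree whatever order the presentation uses.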

Received on Friday, 28 August 1998 09:07:46 UTC