- From: Norman Walsh <Norman.Walsh@Sun.COM>
- Date: Wed, 13 Feb 2002 10:01:15 -0500
- To: Jacek Kopecky <jacek@systinet.com>
- Cc: "'www-tag'" <www-tag@w3.org>
/ Jacek Kopecky <jacek@systinet.com> was heard to say:
| I'm speaking here as a relative newcomer to the depths of XML,
| but I have a feeling that you wish for three things which
| together contradict themselves:
|
| 1) maintain tight control over your vocabulary,
| 2) extend it nevertheless in specific applications,
| 3) validate the extended documents according to the original
|    tight schema.

That's not actually quite what I want. I want to ignore my
instructions for how the document should be processed when I'm
testing the validity of the document. They aren't relevant.

| Why does the specific application not validate against a
| specific schema? You could get the benefit of validating the
| extensions, too.

Let's look at a concrete example. DocBook has <variablelist>s.
They're basically like HTML DLs. Suppose I have a book that contains
a whole bunch of these. I write an XSL stylesheet to produce PDF (via
XSL FOs) from this book. I print it and the design department reviews
it and says, "Yep, perfect, exactly what the publishing specs say. Go
ahead and send it to the printer."

Next, I write a stylesheet to produce HTML for online publication of
the book. This time the design department says, "You know, norm, a
bunch (but not all) of these lists look sort of awkward as HTML
lists. Could you make them into tables instead?"

Naturally, I flatly refuse. They aren't tables semantically and it
would be wrong to turn them into tables in the XML source just
because someone thinks they'd look prettier in HTML. And besides,
even if I was willing to do that, I'd have to go through the whole
print approval cycle again. I'd rather have a root canal.

What I really want here is, uh, how can I describe this? What I want
is an instruction that I can insert into my document that will tell a
particular processor that it should do something special. I want a,
wait for it, a processing instruction!

So I add a few PIs to my source document:

  <variablelist>
  <?dbhtml format="table"?>
  ...
  </variablelist>

I tweak my HTML stylesheet and voila, I'm finished in an afternoon.
And the print stylesheets still do exactly what they should. And the
design department is happy with what the HTML stylesheet produces.
And I get to go home before bedtime and have a cookie because I met
all my deadlines.

The alternative that's most often suggested to PIs is using an
element in a foreign namespace:

  <variablelist>
  <dbhtml:format-as-table/>
  ...
  </variablelist>

I'm sorry, that's just not a reasonable suggestion:

1. I have a $35,000 editing, content, and workflow management system
   that took six months to build, install, and debug, and that is
   built around the DocBook schema. You want me to make a local
   change to that system to support one formatting request?

2. I exchange files with 11 authors and 6 translators on 3
   continents. You want me to propagate my schema change to all of
   them?

3. Some of the folks that I exchange documents with work for stuffy
   organizations that insist on industry standard schemas. DocBook
   does not now allow random namespaced cruft, nor is it ever likely
   to. You want me to get the DocBook Technical Committee to accept a
   request to change the DTD to support my formatting request?
   (Here's a tip: as the chair of that TC, I know what the likely
   answer is going to be :-)

4. *Every* stylesheet that processes the document has to go to
   special effort to deal with or ignore the extra elements. (The
   stock HTML stylesheets for DocBook will turn this into
   <font color="red"><dbhtml:format-as-table></font>, for example.)

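For readers who want to see the PI approach concretely, here is a
minimal sketch. It is written in Python with the standard
xml.dom.minidom module rather than in the XSL stylesheets described
above, purely for illustration. The helper names (pi_pseudo_attrs,
html_format, element_children) and the sample document are
hypothetical. Note that the name="value" syntax inside the PI is only
a convention: XML treats PI data as an opaque string, so the sketch
parses it by hand.

```python
import re
from xml.dom import minidom
from xml.dom.minidom import Node

# name="value" pairs inside a PI are a convention, not XML attributes,
# so they have to be picked out of the PI's data string by hand.
_PSEUDO_ATTR = re.compile(r"""([\w.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def pi_pseudo_attrs(data):
    """Parse name="value" pseudo-attributes from a PI's data string."""
    return {
        m.group(1): (m.group(2) if m.group(2) is not None else m.group(3))
        for m in _PSEUDO_ATTR.finditer(data)
    }


def html_format(variablelist, target="dbhtml", default="list"):
    """Return the format requested by a child PI, or the default."""
    for child in variablelist.childNodes:
        if (child.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and child.target == target):
            return pi_pseudo_attrs(child.data).get("format", default)
    return default


def element_children(node):
    """What a processor that ignores PIs sees: element children only."""
    return [c.tagName for c in node.childNodes
            if c.nodeType == Node.ELEMENT_NODE]


# Hypothetical sample: only the first list carries the PI.
SOURCE = """<book>
  <variablelist>
    <?dbhtml format="table"?>
    <varlistentry><term>a</term><listitem><para>one</para></listitem></varlistentry>
  </variablelist>
  <variablelist>
    <varlistentry><term>b</term><listitem><para>two</para></listitem></varlistentry>
  </variablelist>
</book>"""

doc = minidom.parseString(SOURCE)
lists = doc.getElementsByTagName("variablelist")

# The HTML-oriented processor looks for the PI and acts on it.
formats = [html_format(v) for v in lists]

# A processor that does not care sees identical element content.
children = [element_children(v) for v in lists]
```

In this sketch the HTML-oriented code is the only part that goes
looking for the dbhtml target. Code that walks element children alone
gets the same view of both lists, whether or not the PI is present.
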
The only reasonable answer that I see (please, don't suggest using
CSS instead of a table; that may or may not be reasonable depending
on the browsers involved and it isn't what the design department told
me to do (and if you really wanted me to, I could come up with a
similar example that isn't amenable to a CSS solution)), is to move
this formatting information completely out of band. But that's a lot
more work and it's a lot more fragile.

The PI is entirely harmless (and invisible) to processors that don't
care about it, but provides useful information for processors that go
out of their way to look for it.

The argument that PIs are a security danger doesn't move me at all.
Anyone that implements a system that processes

  <?runthis cmd="rm -rf ~/"?>

knows full well what door they've left open and had better take
precautions.

                                        Be seeing you,
                                          norm

-- 
Norman.Walsh@Sun.COM    | A great deal may be done by severity, more by
XML Standards Engineer  | love, but most by clear discernment and
XML Technology Center   | impartial justice.--Goethe
Sun Microsystems, Inc.  |
Received on Wednesday, 13 February 2002 10:01:22 UTC