- From: Steve Pepper <pepper@FALCH.NO>
- Date: Mon, 3 Mar 1997 19:57:31 +0100
- To: w3c-sgml-wg@w3.org
Thanks to Jon and Tim for providing some rationale for the ERB decisions.

The arguments against using LINK syntax seem to be falling into a number of distinct categories:

1) Malediction
2) Confusion caused by the very word LINK
3) "Nobody understands it"
4) "LINK can't do the job anyway"
5) Verbosity and lack of clarity of the syntax
6) Difficulty of implementation
7) "LINK is unnecessary if WG8 gives us multiple attlists"

1) Malediction
--------------

Although this is currently the largest category, I sincerely hope we all agree that we should be looking for the reasons behind LINK's bad image, rather than indulging in name calling.

2) Confusion caused by the very word LINK
-----------------------------------------

David, Jon and Tim have all raised this issue, and it is quite pertinent. I have already considered it and explained my solution in my reply to David. Just as a reminder:

<!DOCTYPE tei.2 public "-//TEI//DTD P3//EN">
<!PROCSPEC xml-proc tei.2 #IMPLIED [
  <!ATTLIST xref xml-link CDATA #fixed "xml-tlink">
  <!PROCDEF #INITIAL xref>
]>

I am in other words proposing that we exploit the fact that XML uses a fixed SGML declaration to change the reserved names LINKTYPE and LINK to PROCSPEC and PROCDEF respectively. An XML document will then consist of an (optional) document type definition, an (optional) processing specification, and the instance itself.

3) "Nobody understands it"
--------------------------

This is the key argument, I suppose. If as Tim says "only 17 people in the world understand LINK", I see that we have an uphill battle on our hands. The first question is, is this WG prepared to _try_ to understand it before rejecting it? The next question is, is it the _concepts_ that are problematic or the _syntax_? (Another interesting question is whether those that oppose my proposal number themselves among the 17...)

I don't want to waste people's time if only Sam Hunting, one (unnamed) ERB member and I think this is worth pursuing.
If there ARE others, I would like to ask them to show their hands; otherwise I will shut up and go back into hibernation!

4) "LINK can't do the job anyway"
---------------------------------

This was Martin Bryan's argument and it is off the mark for the simple reason that the LINK-based solution is intended to solve the problem at the document type (or element type) level, not at the level of the individual elements. (That is why my example uses a FIXED attribute -- as did Steve DeRose's original example of the so-called "ideal" solution.)

Now, I appreciate that there will be a need to specify values for xml-link attributes at the individual element level in some documents, in which case we are no longer talking about algorithmically associating processing information with structure (i.e. LINK). But I still contend that there is and will continue to be a very important class of documents for which useful XML functionality can be added at the element type level. Examples of this could be something like Jon's Solaris documentation and some of the vast corpora of TEI based information.

5) Verbosity and lack of clarity of the syntax
----------------------------------------------

Jon talks about "voodoo", "ISO obfuscation at its worst", "gibberish" and "apparent nonsense" because of the two lines

<!LINKTYPE xml-link tei.2 #IMPLIED [

and

<!LINK #INITIAL xref>

I submit that we have already accepted at least as much "gibberish" in order to keep XML SGML-compliant. Take for example

<!DOCTYPE tei.2 ...

Totally unnecessary if you ask me. XML only permits one document type declaration, and the document element is known as soon as we hit the first start-tag (because XML doesn't allow tag omission). So all that is needed is the element and entity declarations at the head of the document, in the manner of #DEFINEs and #INCLUDEs. (Now wouldn't that appeal to Jon's "highly competent computer scientists"!)

The same goes for the <!ATTLIST gibberish and the requirement to specify the generic identifier in an attribute definition list declaration. Since XML doesn't allow name groups instead of generic identifiers, why not just put the attribute definition list inside the element declaration and save ourselves a few extra syntax tokens?

I am being facetious, of course. The serious point I want to make is that I support this project because I care about SGML. I am willing to take the trouble to explain why specifying a reduced subset of a powerful language requires carrying a little extra baggage, and I emphatically do *not* accept epithets like "voodoo" and "ISO obfuscation".

6) Difficulty of implementation
-------------------------------

David drew attention to the fact that link attributes have their own name space. I am not enough of a computer scientist to know how big a problem this is, but I suspect that it is exaggerated in this case. Perhaps someone more knowledgeable could provide some insight?

7) "LINK is unnecessary if WG8 gives us multiple attlists"
----------------------------------------------------------

I do not believe this is the case. (So Jon is wrong to impute that I agree with the ERB that multiple attlists are the ideal solution -- I gave the word "ideal" in inverted commas in my posting.)

Why? Well, that isn't easy to explain to anyone who has not "understood" the basic point of LINK: That it is a smart idea to separate the specification of the structural relationships (the DTD) from the specification of the structure-related processing to be performed for a particular purpose (the LPD).

Yesterday [1] I wanted to send my SGML document to an ICADD-aware processor. To do so, I had to add a whole bunch of fixed attribute declarations to my DTD. Today I want to send it to an XML processor and I am being told that I must pile in yet more fixed attributes in order to accomplish this.
Tomorrow and in the future I will think of ever new ways to process my information: I don't want to have to revise the DTD every time I do this. Even if it were *my* DTD, I wouldn't want to overburden it with ever-increasing numbers of fixed attributes every time a new form of processing turned up.

What I *would* like to do is express that processing information in a modular and extensible way and "plug in" the relevant spec at processing time. That is what LINK does. Nothing more and nothing less.

That is the general argument for LINK. There is also a specifically XML-related argument for (at least) a limited subset:

Assuming WG8 gives us multiple attlists, people will start using these in their internal subsets. They will soon tire of adding these declarations to every single document. There will be a tendency to build them into the DTD itself -- maybe not even as separate declarations, but as additional attribute definition lists inside the main attlist declaration for the element type in question: XML-processing attributes along with the more general structural attributes.

Then one day, along comes the opportunity to deliver the document to some XML processor that doesn't require the whole DTD, just a well-formed XML document -- and, of course, the XML-processing attributes. They then have to go back to the DTD and extricate the XML-related attributes. If they are wise, they will put them in a separate entity which is referenced in the DTD but also available for direct inclusion in well-formed documents.

At this point they will have discovered the advantage of separating structure from processing. They will have implemented LINK, albeit in an uncontrolled and informal way.

How much easier if the formal distinction between structure and processing had been built into the XML spec from the word go, so that they could arrive directly at the correct solution instead of beating around the bush!
Then they would have the flexibility to deliver:

- just the well-formed instance (WFI),
- the WFI and the DTD,
- the WFI and the XML processing specification (LPD)
- the WFI, the DTD *and* the LPD

depending on the requirements of the processor. And they would have the freedom to define different XML processing specs for the _same_document_type_ and plug them in and out as needed.

For these reasons, I believe the LINK-based approach is *vastly superior* to the multiple attlist approach.

Finally: A NEW PROPOSAL
-----------------------

So is there anything that can be done that will give us all this power, and remove the syntax-related objections raised by Jon and Tim? I think there is:

Instead of lobbying WG8 for multiple attlists (OK, then: as well as doing that), we lobby for a simplification of the LINK syntax that would allow my example to be expressed as simply as follows:

<!DOCTYPE tei.2 public "-//TEI//DTD P3//EN">
<!PROCSPEC xml-proc [
  <!ATTLIST xref xml-link CDATA #fixed "xml-tlink">
]>

I believe this is possible. Would it bring anyone around?

Regards,

Steve

[1] http://www.falch.no/people/pepper/link.htm

--
Steve Pepper, SGML Architect, <pepper@falch.no>
Falch Infotek a.s, Postboks 130 Kalbakken, N-0902 Oslo, Norway
http://www.falch.no/ tel://+47 2290 2733 fax://+47 2290 2599
"Whirlwind Guide": http://www.falch.no/people/pepper/sgmltool/
Received on Monday, 3 March 1997 14:00:50 UTC