- From: Peter Murray-Rust <Peter@ursus.demon.co.uk>
- Date: Sat, 31 May 1997 18:57:19 GMT
- To: w3c-sgml-wg@w3.org
In message <339050AB.549B@hiwaay.net> len bullard writes:
[...]
> > this because, I believe, somebody suggested dropping [PEs] at all).
>
> I have.

I did as well. (With some trepidation, because I'm enormously
appreciative of what Norbert has done. OTOH, we all know that the ERB
proposes and disposes and that the July1 release of the specs *will* be
different from what we are looking at now.)

> I do because under the murky requirements, no one can
> state a strong enough case for using them. At some point, the
> *expense* of using SGML is trivial next to the cost of trying
> to maintain an unstable code and content set based on a quickly moving
> specification. The content vs features cost curve is making

I think Tim's request [for partial dispensation from PEs] mirrors the
fact that:
- it is difficult to define precisely what the syntax of PEs is.
- their implementation may be (slightly) error-prone, due to this.

I do not believe that - looking at the current spec (Lang970331) - a
competent programmer *unfamiliar with SGML* will unerringly write a
parser that treats PEs 100% correctly. For the newcomer to the spec,
productions such as [45] and [53] are not trivial. I don't feel these
are consistent with Goal 4 (It shall be easy to write programs...)

> the Web a very difficult place to invest as it remains a
> caveat emptor market. This is a concern more serious than the
> maintenance of a few overly complex DTDs that themselves
> can be redesigned for a system without PEs. IMO, this DTD
> maintenance issue is very much overrated when compared
> to the insertion of the technology into the market.

The rest of my post will re-emphasise that we must constantly guard
against rocket science. I believe that if/when XML is successful, >90%
of the developers will never have developed SGML applications before -
and I try to speak for that constituency.

Personally I believe that the DTD will be of less value in XML than the
SGML community is accustomed to and that it will be neither understood
nor required by a large number of applications and developers. CML
probably falls into this category and some of the considerations from
there may be relevant here.

I am revising CML in the light of XML (and feeling very positive about
both). The intention is to publish a new release in about a month. CML
was developed with traditional (self-taught) SGML, but involved a lot of
hairiness in the DTDs. CML included HTML2.0 and created a large number of
content models and name groups, all defined as PEs and read in from
separate files. All of this was managed by a CATALOG with many
components. To someone who is not highly SGML-literate the formal spec
of CML V1.0 is impenetrable.

In the revision (V1.1) the following has emerged:
- it is impossible for me to put any important structural constraints
  on CML documents. Therefore one of the elements has a content model
  of ANY. An alternative approach is my increasing use of
  XML-LINK=SIMPLE to transclude information.
- it is impossible to constrain the possible attribute values by a DTD.
  So the complex name groups in 1.0 are now collapsed to CDATA. All
  verification is semantic rather than syntactic and is linked to the
  existence of (XML-based) glossaries in human-readable form.
- a significant part of V1.0 is now directly covered by XML-LINK and
  (assuming it flies) XML-TYPE. This makes me very happy because CML
  gets a lot simpler, and the XML community as a whole works out the
  generic information objects.
- an increasing amount of CML will be formally supported by other
  disciplines - firstly MathML, and hopefully CGM and other standards.
  A CML document will almost always include some information objects
  from another 'DTD'. Therefore it is either unvalidatable against a
  DTD or the DTD is so forgiving that validation is no big deal.
- most of the validation/processing is done outside the parsing
  process. IMO there is a real need for XML to address semantic
  validation (both XML-LINK and XML-TYPE require this anyway).
  Formalising this for implementers would be a great help.

Because of this, XML validation of CML documents is less important than
semantic validation. There will be a CML1.1 DTD, and I hope it's general
enough that it allows any reasonable document to be created, but will
detect really gross errors. But I'm extremely wary of forbidding someone
from doing something they find useful - they'll do it anyway and switch
off validation. A similar theme is shown in MathML where there is an
attribute OTHER. This can have any value, including new attributes and
values - the MathML authors hope it will be used wisely :-).

I am not suggesting that the DTD concept should be jettisoned from XML,
nor validation - but it represents the limit of what can be included
under 'easy'. The point is that DTDs for XML can be constructed and
tested for XML with free standard tools. They are unlikely to change
very frequently (if they do, the community has a lot of versions for
their documents to contend with).

XML has a great deal that is expected of it. IMO the only way that it
can manage it is to be as modular as possible and for those modules to
be as simple as possible. Obviously they must have well-defined APIs (or
at least clear terminology) for intercommunication. IMO XML-LINK will be
harder than XML-LANG and it's still not clear what freedom the
implementer has or should have. After that we have XML-STYLE, XML-TYPE,
etc. Unless these are developed with interoperability in mind, a full
XML implementation will be a costly business if individuals have to do
the lot.

So I would endorse Len's call for simplicity, and PEs would be a good
place to look critically.

P.

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
Received on Saturday, 31 May 1997 17:08:10 UTC