- From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
- Date: Sat, 22 Mar 97 17:01:04 CST
- To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
On Fri, 21 Mar 1997 18:06:23 -0500 Peter Murray-Rust said: >In message <3332D971.68F5@utila.ifi.uni-klu.ac.at> "Norbert H. MIKULA" writes: >> As far as I can remember the ERB has initially decided >> to change the syntax for comments to >> >> <--* ..... *--> >> >> (posted to the XML-WG : Wed, 15 Jan) >> >> But the torture.xml file of cmsmcq uses <!--* .. *--> The original posting of the decision did have the form <--* ... *--> but it was a typo, corrected shortly thereafter. > ... >> What is the current state of things ? > >I share Norbert's concern about uncertainties in the XML draft and feel >that a number of us are 'stalled' at present due to one or more >uncertainties in the spec. (It may be that these are simple >misconceptions, but they need tidying up.). We agree that the mythical Tim Bray and I are at work on a revised version of the draft spec which will include fixes for the reported errors and the changes so far agreed to, for the comment syntax (adding the asterisks) and white-space handling. >grad student can hack a parser in the mythical two weeks, but only if >they have a clear spec to write to. [My own position is that I want to Right now, the standard disclaimers pasted on all W3C working drafts remain appropriate: the draft is subject to change. If a given version of the draft is clear and consistent with itself, I think we're doing OK. If implementors want currency more than consistency and clarity, the decisions we've made have all been announced, and the logs of the WG are the place to find the decisions. >My understanding is that the productions (1-77) are consistent and can >be used as the basis of a yacc-like approach (as NXP does, using JACC). >So the first question is [see Norbert's query]: > >(a) are we agreed that (1-77) in WD-xml-961114 are the current version >and that none are under revision at present? (Until Norbert's question >I had assumed that [21] (Comments) was correct). They are the current version; the current version is always subject to revision. In fact, the only changes being made to the productions are: - correction of reported errors (missing *, missing S, etc.) - implementation of the comment-syntax decision, to prepare XML for use of comment open- and close-delimiters when SGML gets them - incorporation of more of the constraints on entity replacement into the grammar (in particular parameter entities) - simplification of some productions using regular-expression subtraction. (This is a notation Tim found in some orthodox CS treatment of regular expressions, so it's not just ad hockery: (A - B) defines the language which contains all the members of A except those which are also members of B. Comments can be defined as '<!--*' (Char* - (Char* '*-->' Char*)) '*--> which is, in a slightly more formal way, almost exactly what Lee suggested long ago: anything can appear in a comment, but not the string '*-->'. Translation from this notation into that of a specific regular expression engine may require close attention to detail ... >(b) some parsing operations (e.g. entity replacement) are not described >in the BNF and are sufficiently complex or insufficiently documented to >give serious problems in implementation. It would be valuable for these >to be listed and the operations clearly defined (e.g. are comments >processed before entity replacement? are nested entities allowed? etc.) Since we'll add some things to the grammar to allow us to talk more directly about parameter entities and how they are expanded and where they are going to occur, I think we'll be able to say very explicitly when general entities are / may be / must be expanded. I don't think we should constrain the order of comment handling beyond what is logically implied by the constraints given in the section on logical and physical structure. >(c) some ancillary constructs (e.g. CATALOG) are widely held to be part >of XML (or likely to be part of XML). They are probably not too >difficult to implement if certain processes (e.g. resolution of FPIs) >are not exhaustively defined. Catalogs are not now part of XML. Whether they are likely to become part of XML is something every implementor must judge individually. I hope this helps. I see Tim has already replied, but I will send this anyway in case having the same message in a second formulation proves helpful. -C. M. Sperberg-McQueen
Received on Saturday, 22 March 1997 18:22:51 UTC