- From: Steven J. DeRose <sjd@eps.inso.com>
- Date: Mon, 19 May 1997 11:39:14 -0400
- To: w3c-sgml-wg@w3.org
At 12:37 AM 05/17/97 GMT, Peter Murray-Rust wrote: >SD1 (short tags). This is trivial for parser writers, but I am no longer in >favour of it because (a) I suspect some XML documents will be hand-authored >(or at least hand-edited) - a rich source of confusion - and (b) error >recovery for missing endtags in WF documents would be effectively error >creation. Very well put. Omitting GIs from end-tags removes redundancy; and therefore inevitably reduces the possibility or error detection (see anything by Claude Shannon). Someone's idea that it allows us to avoid even having a "wrong GI in end-tag" message and that that is an advantage, is only tenable if when that error actually occurs, what one wants *instead* is a parser that silently assigns the wrong end-tag without even noticing that it's wrong. >SD2 (structured attributes). This seems to require either (a) rewriting >SGML and the parsers or (b) doing some fairly cunning pre-parser manipulation. >I cannot see a working prototype by July 1, even if we agreed the syntax >today, since there would be too many implied semantics to be resolved >quickly. It would also break the Esis and ElementTree output and although >it might be representable by groves, there are not yet any XML applications >that use groves? Note that massive changes are only required if you wish XML to *validate* or constrain such attributes. You can stuff a stream of SGML or Lisp or whatever in an attribute *right now* and have structure. I don't see your point about "too many implied semantics" -- we don't have to say any more about semantics in attributes than we do in content. For ***example***, we could easily state the syntax requirements for putting structure inside attributes, simply by using XML syntax there (gee, you have to escape quotes), or by using Lisp/Scheme/DSSSL Scheme syntax there (gee, you still have to escape quotes). Just cite either definition. This may or may not be the best thing to do, but the syntax is easy and we don't have to touch semantics. >SD3 (Data types). I don't think there are syntactic problems here, although >I suspect the namespace of potential types is sufficiently large that we >would debate for some time. (Date?, Date+Time?, DateTime?, etc. and there >will also have to be discussion about floats - precision, overflow behaviour, >format specification, etc.) I wouldn't object to a placeholder and make as >much progress as possible. I thought the proposal was to take a very small, specific set of types: the atomic datatypes agreed on in RDBs. Not a big deal. And there are plenty of standards for exactly how such things behave (if we even have to care): IEEE floating point, for example, which almost everybody uses. Just cite the existing literature. Differing from it would be hugely more work, make us gratuitously incompatible, and make us look foolish unless/until we showed superb and compelling reasons. SGML already gets flak for not having ubiquitous types like integer, real, character, and so on. If we were to add those (I'm not certain we should, just "if"), I think few people would complain that we didn't also add imaginary and transcendental numbers, polynomials, color spaces, and so on. Steven J. DeRose, Ph.D., Chief Scientist Inso Electronic Publishing Solutions (formerly EBT)
Received on Monday, 19 May 1997 11:42:16 UTC