- From: Joe English <jenglish@crl.com>
- Date: Wed, 23 Oct 1996 17:16:27 -0700
- To: w3c-sgml-wg@w3.org

lee@sq.com wrote:
> So we can use <e/> and understand that existing SGML parsers will break,
> but we'll try and fix that by changing SGML and hope that at least some
> of the commercial SGML parsers are updated -- maybe most of them.
> (or can this be done without breaking things? I don't see how, although
> if James said it could, he is right, and I have just forgotten!)
> (oh, OK, a shortref or datatag mapping to > perhaps?)

Full-featured SGML parsers (i.e., SP) can be tricked into accepting
that syntax by specifying "/>" as the NET delimiter and setting
SHORTTAG YES in the SGML declaration. Thus this:

    <e/>

gets parsed as STAGO, generic identifier, NET, and since 'e' is
(presumably) an EMPTY element, the next NET is not necessary.

I think the syntax is OK from an aesthetic standpoint, but there
are a *lot* of things that can go wrong....

First off, it requires SHORTTAG YES, which means that all the other
SHORTTAG features (that we have already decided we do *not* want)
get enabled.

More importantly, if somebody feeds an XML document to an SGML parser
and forgets to supply the right SGML declaration -- or tries to use
SGMLS for that matter, which won't let you change the RCS delimiter
strings -- she'll end up with a spurious ">" after every EMPTY element.

Or if somebody (naively or accidentally) types <e/> for an element
that is not in fact EMPTY, his parser stack will get severely out
of whack.

My current preference is <@e>, which requires @ being added to
NMSTART and an XML application convention for naming EMPTY elements.
This would be much more robust. (SGMLS won't let you do this either,
but it could be hacked to do so more easily than it could the
"NET=/>" trick. SP can handle either solution. I don't know
about other parsers.)

--Joe English

  jenglish@crl.com

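As a rough illustration of the failure modes described above, here is a toy Python model of how a net-enabling start-tag might be tokenized under different NET delimiters. It is only a sketch, not how SP or SGMLS actually work: the `parse` function, its event tuples, and the simplified name pattern are all hypothetical, and it recognises nothing except STAGO + generic identifier + NET and a bare NET acting as a null end-tag.

```python
import re

# Simplified name pattern (assumption; real SGML name rules come from the declaration).
NAME = re.compile(r"[A-Za-z][A-Za-z0-9.-]*")

def parse(text, net, empty):
    """Toy model: handles only net-enabling start-tags (STAGO, GI, NET)
    and the null end-tag (a bare NET). Everything else is data."""
    events, open_net, data, i = [], [], "", 0

    def flush():
        nonlocal data
        if data:
            events.append(("data", data))
            data = ""

    while i < len(text):
        m = NAME.match(text, i + 1) if text[i] == "<" else None
        if m and text.startswith(net, m.end()):
            flush()
            gi = m.group()
            events.append(("start", gi))
            if gi in empty:
                events.append(("end", gi))  # EMPTY: no closing NET expected
            else:
                open_net.append(gi)         # stays open until the next NET
            i = m.end() + len(net)
        elif open_net and text.startswith(net, i):
            flush()
            events.append(("end", open_net.pop()))
            i += len(net)
        else:
            data += text[i]
            i += 1
    flush()
    return events

# NET redefined as "/>": <e/> is consumed whole.
print(parse("a<e/>b", "/>", {"e"}))
# Reference NET "/" (declaration forgotten): the ">" leaks into the data.
print(parse("a<e/>b", "/", {"e"}))
# <e/> used on a non-EMPTY element: e is never closed by the input.
print(parse("<e/>x<f/>y", "/>", {"f"}))
```

In the second call the ">" ends up in the character data following the element, and in the third call no end event is ever produced for `e`, which is the "out of whack" case in miniature.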
Received on Wednesday, 23 October 1996 20:16:02 UTC