- From: Dave Peterson <davep@acm.org>
- Date: Fri, 6 Jun 1997 11:16:12 -0400
- To: w3c-sgml-wg@w3.org
At 11:49 AM 6/5/97, lee@sq.com wrote:

>The simplest way to implement them that I can see is to push the input
>stack on %name; and to pop it on the end of the corresponding string.
>This isn't exactly the SGML way.

It's not? What's different? Are you thinking of the Ee? That's just a device to help describe "semantic" conditions limiting where entities can begin and end.

>I therefore think that implementation would be eased if S were removed
>from all productions and replaced by a set of tokenisation rules.
>The input could then be considered as a token stream rather than as
>a byte stream with delimiters -- that is to say, both views would be
>equally conformant and correct.

>People who have implemented parsers -- might that have been easier?

In SGML, tokenization beyond the "each character is a token" level is difficult without feedback information from the state of the parser. Do we think XML has eliminated the feedback requirement?

There has been some thought to describing SGML parsing as a two-stage affair, where the tokens read by the second parser are emitted by the first parser. This can be used to hide boundaries of marked sections and entities (by not emitting them from the first parser) and also to isolate the non-traditional parsing requirements in the second parser where, because they are not simultaneously dealing with all the parsing that can be done "traditionally", the construction might be easier.

Yet more of the stuff that needs to be worked out for the revision. (Gasp, Pant, "Help!")

Dave Peterson
SGMLWorks!

davep@acm.org
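
For readers following the thread, the input-stack approach in the quoted text (push on %name;, pop at the end of the corresponding string) might be sketched roughly as below. This is only an illustration under stated assumptions, not the behaviour specified by SGML or XML: the entity table, the `expand` function, the simplified name pattern, and the recursion check are all hypothetical.

```python
import re

# Hypothetical entity table, for illustration only.
ENTITIES = {"greeting": "hello %name;", "name": "world"}

# Simplified pattern for a %name; reference (an assumption, not the
# full name grammar of either SGML or XML).
PE_REF = re.compile(r"%([A-Za-z_][A-Za-z0-9_.-]*);")


def expand(text, entities):
    """Expand %name; references using an explicit input stack."""
    # Each frame is [entity name or None, string being read, read position].
    stack = [[None, text, 0]]
    out = []
    while stack:
        frame = stack[-1]
        _, source, pos = frame
        if pos >= len(source):
            # End of the corresponding string: pop the input stack.
            stack.pop()
            continue
        match = PE_REF.match(source, pos)
        if match:
            ref = match.group(1)
            if ref not in entities:
                raise KeyError("undeclared entity: " + ref)
            if any(f[0] == ref for f in stack):
                raise ValueError("recursive entity reference: " + ref)
            # Resume after the reference once the pushed frame is popped.
            frame[2] = match.end()
            # Reference seen: push the replacement string.
            stack.append([ref, entities[ref], 0])
        else:
            out.append(source[pos])
            frame[2] = pos + 1
    return "".join(out)
```

Calling `expand("<%greeting;>", ENTITIES)` walks the outer string, pushes the replacement for `greeting`, pushes `name` from inside it, and pops each frame when its string runs out. A consumer reading the output never sees where an entity began or ended, which is the boundary-hiding role suggested above for the first stage of a two-stage parser.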
Received on Friday, 6 June 1997 11:16:29 UTC