- From: Bill Lindsey <blindsey@BDMTech.com>
- Date: Sun, 15 Sep 1996 14:20:57 -0600 (MDT)
- To: Martin Bryan <mtbryan@sgml.u-net.com>, w3c-sgml-wg@w3.org
On Sun, 15 Sep 1996, Martin Bryan wrote: > > ((again) We can still use un-minimized end tags. (/again)/) > > Note that the above example is incorrect if you use TAGC on start-tags as > well as end-tags, because it should read: > > ((again)/) We can still use un-minimized end tags. (/again)/) I don't think it is incorrect. The idea is _every_ start tag uses NET rather than TAGC, unless we want to make explicit that it is EMPTY. My understanding is that an element which is begun with a NET enabling start tag may be closed with an un-minimized close tag as well as a NET. I'd rather eliminate un-minimized end tags entirely. The above example was just to show that a simple paren matching algorithm would not need any special case handling for their presence. Let me clarify my intent in the exercise. I wanted to see if one could, within today's 8879, 1) use a single character to terminate elements, and 2) enable the use of a simple character counting algorith to track your location in an XML document, and 3) solve Arjun Ray's problem with tag-soupers' spanning element tendency. It turns out that NET can be that single character, and if you use a sequence of two of its mate for STAGO, you have something that appears to meet the three criteria. As it happens, by defining a rule which say's TAGC is only used on end tags and start tags with EMPTY content, you have the flip-side of the trapeze act, and no need to alter 8879 to distinguish ETAGC and STAGC. Now, just to be a bit more radical, if we do decide that un-minimized end tags are more harmful than they're worth, and we decide it is not necessary to identify EMPTY elements, then we no longer have any use for either ETAGO or TAGC. > I, for one, am not using ( and ) all over the place! > I might consider: > > [[again]We can still use un-minimized end tags (aka [[cite]ISO 8879:1986 > .... (SGML)[/cite]/]).[/again]/] or, [[again]We can still use un-minimized end tags (aka [[cite]ISO 8879:1986 .... (SGML)]).] You are right, there are problems with just about any delimiter characters we choose. They either are too common in text, or don't offer the visual cues we'd like. (I'm not too sure that <foo></foo> really provides any visual cues beyond what we've trained ourselvees to see.) There are several ways around this within 8879, none particularly attractive: 1) sacrifice the visual cue, choosing a rarely encountered character. 2) require lots of character entity references. How about &L; and &R;? Hey, we never said we had to make it easy on the typist. 3) allow the declaration of an alternate concrete syntax. 4) invent some stupid MSOCHAR/MSICHAR/MSSCHAR trick for escaping. 5) Use some non-text characters which resemble brackets when viewed, and teach our text editors to deal with 'em. (offered by someone who wishes to remain anonymous) Any others? -Bill
Received on Sunday, 15 September 1996 16:30:33 UTC