Opinions on some of Michael's undiscussed topics

At 7:15 PM 9/23/96, Michael Sperberg-McQueen wrote:
>* Should XML use the markup-declaration syntax described by ISO 8879
>clauses 10-11, or should XML define a specialized document type and let
>its markup declarations use the document-instance syntax, as proposed by
>MGML?
>
>(Using the same syntax for declarations and instances cuts the size of
>the grammar approximately in half.  It also reflects a firm belief that
>structured information belongs in SGML documents.)

I need to review MGML, but I think this would be great, as I too feel the
desire for a meta-DTD every time I write a DTD. Unfortunately, the one time I
tried to do one, I did a project-specific one, so I still don't have
the general tool that I need. I agree with Eliot that we might get some
things wrong, but the risk of getting something disastrously wrong with
this group is slight. And the goal of easy implementability should
steer us away from a lot of tempting problems.

>* How should XML deal with the need for conditional inclusion of markup
>declarations, if XML has no marked sections (10.4.1)?

Sentiment seems to lean toward allowing marked sections in DTDs. DTD subsets
are an open problem. These could also be handled by the conditional inclusion
mechanism for instances, if we align the two syntaxes...

>* Should XML change the delimiter-in-context rules to require the STAGO
>and ERO strings to be escaped whenever they are not to be recognized as
>delimiters (9.6)?
>
>(Some people have proposed this; certainly many user manuals I've seen
>prescribe this behavior for users, rather than explaining the d-i-c
>rules.)

I wouldn't die for this, but I would love it. The contextual rules are just
too hard to remember on the fly. And, as James has noted, they are another
little convenience that the parser has to care about. I'd rather we end up
with a language where we could have a real split between lexical analyzer
and parser, and eliminating d-i-c seems to aid that.
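
A minimal sketch of that split in Python, assuming '<' and '&' must always
be escaped in data, so that they are delimiters wherever they occur. The
token names and the lex function are illustrative only; they are not taken
from ISO 8879 or from any existing parser.

```python
import re

# Hypothetical token classes for character data; names are illustrative.
TOKEN_RE = re.compile(
    r"(?P<ETAGO></)"      # end-tag open
    r"|(?P<STAGO><)"      # start-tag open
    r"|(?P<ERO>&)"        # entity reference open
    r"|(?P<DATA>[^<&]+)"  # anything else is character data
)

def lex(text):
    """Split text into (kind, lexeme) pairs without consulting parser state."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]
```

Under the delimiter-in-context rules, by contrast, whether '<' or '&' opens
markup depends on what follows it, so a fixed table like this one is not
enough on its own.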

>* Should XML use MSOCHAR, MSSCHAR, and MSICHAR strings (9.7)?
>
>(If MSSCHAR is '\', then \< and \& are escaped in the expected way.
>Unlike most Unix-style backslash escapes, however, the MSSCHAR is not
>removed from the data stream; this means \ characters in existing docs
>are safe.)

This breaks for strings like the following:

    \&entity;\

Actually I favor the proposal, but we shouldn't expect magic for existing
SGML instances if we accept it.
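
To illustrate the breakage, here is a minimal Python sketch. It assumes, as
I read the proposal, that an MSSCHAR of '\' suppresses recognition of the
delimiter that follows it and stays in the data. The expand function, its
pattern for entity names, and the sample entity table are hypothetical.

```python
import re

# Assumed for illustration: a reference is '&name;' and MSSCHAR is '\'.
ENTITY_REF = re.compile(r"(\\?)&([A-Za-z][A-Za-z0-9.-]*);")

def expand(text, entities, msschar_enabled):
    """Expand entity references, optionally honoring a '\\' MSSCHAR."""
    def repl(m):
        backslash, name = m.group(1), m.group(2)
        if msschar_enabled and backslash:
            # Reference suppressed; the backslash is kept in the data.
            return m.group(0)
        # No suppression: the backslash is ordinary data before the reference.
        return backslash + entities[name]
    return ENTITY_REF.sub(repl, text)

sample = "\\&entity;\\"            # the string \&entity;\
table = {"entity": "X"}            # hypothetical replacement text
old_reading = expand(sample, table, msschar_enabled=False)
new_reading = expand(sample, table, msschar_enabled=True)
```

With the MSSCHAR rule off, the sample yields a backslash, the replacement
text, and a backslash. With it on, the reference is left unexpanded, which
is not what the existing document meant.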

>
>* Should XML require system and public identifiers to be FORMAL (13.5)?

Yes. I'm inclined even to remove system IDs altogether (even formal ones)
and leave them only in CATALOGs.

>* Should XML restrict comment declarations to a single comment (10.3)?

Sure. I can see no rational justification for multiple comments in a single
comment declaration anyway. I'd rather go further and allow '--' inside
comments except when followed by '>'. Otherwise the single-comment nature
of the declaration will make the no-'--' rule even harder to justify...
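
A minimal Python sketch of the stronger rule, assuming a comment declaration
opens with '<!--' and closes at the first '-->' and that '--' is otherwise
unrestricted inside it. The pattern and the comments function are
illustrative, not a proposed grammar.

```python
import re

# A comment runs from '<!--' to the first '-->'; a bare '--' inside is allowed.
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)

def comments(text):
    """Return the body of each comment declaration found in text."""
    return COMMENT.findall(text)
```

For example, comments("<!-- a -- b --><p>") returns a single body that
still contains the inner '--'.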


   -- David


RE delenda est.

--------------------------------------------+--------------------------
David Durand                  dgd@cs.bu.edu | david@dynamicDiagrams.com
Boston University Computer Science          | Dynamic Diagrams
http://www.cs.bu.edu/students/grads/dgd/    | http://dynamicDiagrams.com/

Received on Friday, 27 September 1996 13:12:09 UTC