Re: A few comments on the draft
>Perhaps the best way to do this would be to use the
>Posix regexp notation and say:
> S ::= [:space:]
>and then in the character class section define exactly
>which code points map to space.
>The productions would then all become much simpler, as they would be
>in terms of a sequence of tokens.
and the ideas preceding this are even better. I like simple
token-delimited parsing models (good for quick hacks).
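
To make the token-delimited idea concrete, here is a minimal Python sketch of what such a quick hack might look like. The contents of SPACE_CHARS are purely an assumption for illustration (the draft's character class section would define the real set), and the names SPACE_CHARS and tokenize are hypothetical.

```python
# Hypothetical character class: the code points that count as S.
# This particular set (space, tab, CR, LF) is an assumption for
# illustration, not something taken from the draft.
SPACE_CHARS = frozenset(" \t\r\n")


def tokenize(text):
    """Split text into tokens delimited by runs of S characters."""
    tokens = []
    current = []
    for ch in text:
        if ch in SPACE_CHARS:
            # A run of S ends the current token, if there is one.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    # Flush the final token when the input does not end in S.
    if current:
        tokens.append("".join(current))
    return tokens
```

With S defined once as a character class, productions could then be written over the token sequence rather than mentioning S at every step.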
>The requirement for a root seems to preclude forests.
>We've found forests to be very useful, especially in the Canadian
Ahh, but a forest is but a tree of trees after all...
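
A small Python sketch of that point, assuming a hypothetical Node type and a made-up wrapper element name ("forest"); neither comes from the draft. A forest is stored under a synthetic root, and the original trees are recovered by reading the root's children.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical minimal tree node: a name and ordered children.
    name: str
    children: list = field(default_factory=list)


def wrap_forest(trees, root_name="forest"):
    """Represent a forest as one tree by adding a synthetic root.

    The name "forest" is an arbitrary illustrative choice.
    """
    return Node(root_name, list(trees))


def unwrap_forest(root):
    """Recover the original forest from the synthetic root."""
    return list(root.children)
```

The cost is one extra element that carries no content of its own, which is the objection a forest user might still raise.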
>All white space should be retained at the parser level in XML,
>at least outside of a DTD. Inside a DTD I'd really hate it if a
>parser included the S nonterminal in parse trees!
Quite! As I noted, if all whitespace is kept around, then the parser
output is the same regardless of whether one has a DTD or not. A
validator can do whatever it wants with that, and one thing might
well be to enforce SGML RE/RS handling rules, thereby producing
(assuming a valid document) a post-validator data structure exactly
the same as that produced by an SGML system. On the other hand, other
applications (formatters for example) could toss the whitespace as
they see fit.
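
As a rough sketch of that division of labour, assume the parser emits a flat list of (kind, data) events with every text run preserved. The event shape and the function name below are assumptions for illustration only, and this does not attempt to implement the SGML RE/RS rules; it shows only the simpler "toss it" case a formatter might choose.

```python
def drop_whitespace_only(events):
    """Return the events with whitespace-only text events removed.

    `events` is a hypothetical parser output: a list of
    (kind, data) tuples in which all whitespace has been retained.
    """
    return [
        (kind, data)
        for kind, data in events
        if not (kind == "text" and data.strip() == "")
    ]
```

The parser output itself is untouched, so a validator applying stricter rules can start from the same event list.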
>I agree with Gavin that the PI hack sucks.
Thank you ;-)
>> Section 4.2.2 Seems a shame to limit SYSTEM ID's
>> to URL's. The FSI backward compatibility note
>> seemed enough to allow them...
>I don't understand this comment. seemed enough to allow what?
>URLs _are_ allowed. It's a really bad idea to prefix them with <URL>,
>as that way you can't treat the same file as containing filenames and
>as containing URLs. If SGML used only the same syntax everywhere, so
>that FSIs were attributes on elements, we could use arch forms!
No. I wished to at least *allow* FSI's, though not as the normative