- From: Paul Grosso <paul@arbortext.com>
- Date: Tue, 17 Dec 96 10:05:37 CST
- To: w3c-sgml-wg@w3.org
> From: bosak@atlantic-83.eng.sun.com (Jon Bosak)
>
> [Chris Maden:]
>
> | 3) A dichotomy between "DTD-ful" and DTD-less parsing will make any
> | sibling-based relationship difficult at best; this will affect some
> | TEI or HyQ based hyperlinks, as well as sibling-based stylistic
> | decisions.
>
> Sorry to be so slow here, but what's the connection with sibling
> relationships?  My idea of a well-formed XML document is one for which
> there is just one possible tree structure;

Precisely what I would hope too.  That was one of the reasons behind
my earlier posting on this, part of which you quote below.

> what's different about
> sibling relationships if a DTD is provided?

If whitespace is significant (i.e., contributes to the grove) in one
case and not in the other, then you will have what HyTime considers to
be pseudo-elements in the first case and not in the second.  The first
case arises when the DTD is not available: whitespace in element
content is not known to be within element content and is therefore
considered significant.  The second arises when the DTD indicates that
the whitespace is in element content.

For example, consider:

<book><section><p>Paragraph one.</p> <p>Paragraph two.</p></section></book>

If <section> is known to have element content, then the <p> element
whose content is "Paragraph two." is the second child of <section>,
whereas if <section> has mixed content, that same <p> is the third
child of <section>.

A HyTime treeloc of 1 1 2 that should properly address the
"Paragraph two." <p> element when <section> is known to have element
content would instead address the " " pseudo-element when <section>
is assumed to have mixed content.

> From: bosak@atlantic-83.eng.sun.com (Jon Bosak)
>
> [Paul Grosso:]
>
> | If we have defined the concept of well-formed XML precisely so that we
> | can deal with XML instances without DTDs, then I suggest we refine the
> | definition of well-formedness to include what we might call
> | "normalized whitespacing."  An XML document is well-formed (and
> | therefore can be properly processed without reference to a DTD) *only*
> | if it contains no (non-markup) whitespace that would be insignificant
> | if it were parsed with reference to its DTD.
>
> I rather like this idea, but what do you mean by "its DTD"?  There are
> an infinite number of candidates.

By "its DTD" I really mean any one of its infinitely many possible
DTDs.  In particular, a well-formed doc could never have whitespace
that could be significant in any possible DTD and insignificant in any
other possible DTD.  It could only have whitespace that would always
be significant (e.g., in between two words) or that would always be
insignificant (e.g., an RE immediately following a start tag or
immediately preceding an end tag).
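
To make the treeloc example above concrete, here is a minimal sketch in
Python using the standard xml.dom.minidom module.  It is only an
illustration under stated assumptions: the helpers children(), treeloc()
and describe() are hypothetical names invented here, treeloc() is a
simplified reading of a HyTime treeloc (the first step selects the root,
each later step is a 1-based child position), and "element content" is
approximated by dropping whitespace-only text nodes, as a parser that
knows the DTD could do.

```python
from xml.dom import minidom

DOC = ("<book><section><p>Paragraph one.</p> "
       "<p>Paragraph two.</p></section></book>")


def children(node, element_content):
    """Return the child nodes of `node`.

    If element_content is True, whitespace-only text nodes are dropped,
    approximating a parser that knows from the DTD that `node` has
    element content.  If False, every text node is kept, as when the
    content is assumed to be mixed.
    """
    kids = list(node.childNodes)
    if element_content:
        kids = [k for k in kids
                if not (k.nodeType == k.TEXT_NODE and k.data.strip() == "")]
    return kids


def treeloc(root, steps, element_content):
    """Simplified treeloc walk: steps[0] selects the root itself, and
    each following step is a 1-based position among the children."""
    node = root
    for step in steps[1:]:
        node = children(node, element_content)[step - 1]
    return node


def describe(node):
    """Short printable description of an element or text node."""
    if node.nodeType == node.TEXT_NODE:
        return repr(node.data)
    return "<%s> %r" % (node.tagName, node.firstChild.data)


root = minidom.parseString(DOC).documentElement

# <section> known to have element content: 1 1 2 reaches the second <p>.
print(describe(treeloc(root, [1, 1, 2], element_content=True)))

# <section> assumed to have mixed content: 1 1 2 reaches the " " node.
print(describe(treeloc(root, [1, 1, 2], element_content=False)))
```

The same address thus resolves to two different nodes depending only on
whether the whitespace between the two <p> elements is kept, which is
the ambiguity that "normalized whitespacing" is meant to rule out.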
Received on Tuesday, 17 December 1996 11:09:58 UTC