- From: Kent M Pitman <kmp@harlequin.com>
- Date: Fri, 24 Apr 98 11:43:23 EDT
- To: xml-editor@w3.org
- Cc: kmp@harlequin.com
The introductory text in section 4, Physical Structures, is very confusing.
It uses a meaning for "parsed" which is alien to any meaning of "parsed"
that I am familiar with. If I understand at all, after many readings, the
word "parsed" could usefully be replaced by the word "XML" (or "XML entity"
or "XML document"), and "unparsed" by "non-XML" (or "non-XML entity" or
"non-XML document").

As nearly as I can tell from your use of "parsed",

 (a) it has nothing to do with the issue of whether the text has been
     changed from XML source characters to a structural representation
     of XML [the thing I normally associate with parsing], and

 (b) it is both insulting to implementors of other systems, not to mention
     wholly confusing, to suggest that [for example] a database is not
     parsed. The whole point of a database is that it IS parsed--it is NOT
     source representation [unparsed], but a highly structured
     representation.

- - - - -

Here are some examples of confusions I had while reading this text, to help
you understand why the chosen text is not good:

(1) I was imagining that '<!ENTITY FOO "BAR">' was unparsed if represented
as the string [character vector]:

 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |<|!|E|N|T|I|T|Y| |F|O|O| |"|B|A|R|"|>|
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

and that it was parsed if it was represented as some structured object:

 +-------+----------------+
 | Class | XML Markup     |
 +-------+----------------+
 | Kind  | General Entity |
 +-------+----------------+           +-+-+-+
 | NAME  |      +-------------------> |F|O|O|
 +-------+----------------+           +-+-+-+      +-+-+-+
 | VAL   |      +--------------------------------->|B|A|R|
 +-------+----------------+                        +-+-+-+

(2) Then I worried that maybe the "parsed" part was "BAR". That maybe
instead of substituting the text vector "BAR", I was supposed to have
pre-parsed that.
For example, if I'd seen

 <DEFINE % ZAP '<!ENTITY FOO "BAR">'>

that I wasn't supposed to substitute

 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |<|!|E|N|T|I|T|Y| |F|O|O| |"|B|A|R|"|>|
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

for %ZAP; where it occurs but I was instead supposed to substitute

 +-------+----------------+
 | Class | XML Markup     |
 +-------+----------------+
 | Kind  | General Entity |
 +-------+----------------+           +-+-+-+
 | NAME  |      +-------------------> |F|O|O|
 +-------+----------------+           +-+-+-+      +-+-+-+
 | VAL   |      +--------------------------------->|B|A|R|
 +-------+----------------+                        +-+-+-+

But that didn't make sense because some objects can't be parsed without
knowledge of their context and parameter entity definitions contain no
notion of the content of their expansion.

(3) For a while, I also worried that "PEReference" meant "Parsed Entity
Reference" until I (fortunately) found mention of a "Parameter Entity
Reference". I *really* do not like cute little two-letter unintelligible
abbreviations, like PE, and would prefer that definition [69] (and its
callers) refer to ParamEntityReference, not PEReference. ("cp" is another
two-letter abbrev that annoyed me; my memory of SGML says it should be
"content particle" but I use other systems where it means other things
like "command processor" and using a short name encourages that confusion.)

- - - - -

Here is what I *think* the section in 4. Physical Structures is trying
to say:

[By the way, I find the remark in the first paragraph about how the
external dtd subset is not identified by name to be confusing. If it's
external and it has no name, how can it not be identified by name??]

==============================================================================
4. Physical Structures

...

Entities may be either XML documents themselves, or documents of other
kinds not intended to be parsed by XML. An XML document's contents are
referred to as the `replacement text' for the `entity name' that names
the XML document.
A non-XML entity is a resource whose contents are either not text or, if
text, are not to be interpreted as XML. Each non-XML entity has an
associated notation, identified by name. Beyond a requirement that an XML
processor make the identifiers for the entity and notation available to
the application, XML places no constraints on the contents of non-XML
entities.

XML entities are invoked by name using entity references; non-XML
entities are invoked by name, given in the value of ENTITY or ENTITIES
attributes.

...
==============================================================================

By the way, I find the ", see below," in paragraph 1 of Physical Structures
to be visually confusing and not helpful. Also, immediately following, I
don't understand why an "external DTD subset" is not referred to by name.
How can anything external ever be addressed if not by name? I tried to
find a definition of "external DTD subset" which answered this question
usefully, but found nothing really helpful.
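
For concreteness, here is a minimal sketch of the distinction as I read it, with one entity of each kind declared in a single document. It is an illustration only, not text from the draft: the names doc, pic, src, logo and gif, and the file logo.gif, are hypothetical, and FOO is reused from example (1) above.

```xml
<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ELEMENT doc (#PCDATA | pic)*>
  <!ELEMENT pic EMPTY>
  <!-- src must name a declared unparsed ("non-XML") entity -->
  <!ATTLIST pic src ENTITY #REQUIRED>

  <!-- notation that identifies the format of the non-XML data -->
  <!NOTATION gif SYSTEM "image/gif">

  <!-- parsed ("XML") entity: its replacement text is the string BAR -->
  <!ENTITY FOO "BAR">

  <!-- unparsed ("non-XML") entity: external resource with a notation -->
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<doc>&FOO; <pic src="logo"/></doc>
```

In this sketch FOO is invoked with the entity reference &FOO; and its replacement text becomes part of the document, whereas logo is only ever named as the value of the ENTITY attribute src. Writing &logo; in the content would be an error, since entity references may not name an unparsed entity.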
Received on Friday, 24 April 1998 11:40:07 UTC