- From: <Michael.Goulish@SoftwareAG-USA.com>
- Date: Tue, 23 May 2000 11:57:11 -0400
- To: xml-editor@w3.org
- Cc: Mike.Champion@SoftwareAG-USA.com
- Message-ID: <B48FCF558294D311ADD90080C8FAF3F85064AE@sunshine.ptg.sagus.com>
Greetings to the XML-Editor!

I recently implemented a parser for the full XML grammar in C. I may be unusual in that I had no experience with XML when I started this project, but I have over 15 years' experience as a full-time programmer and, before that, an MS in computer science. I thought you might be interested to hear which parts of the XML 1.0 spec confused me the most. (I reserve the right to find other parts confusing in the future.)

1. Not all the productions belong to the grammar.
-------------------------------------------------

In my world, grammars have a single start symbol. If you represent a grammar as a tree, you *always* see a connected tree. That means you can start with the start symbol and, through some series of steps, reach any other symbol in the grammar. Any symbol that's not reachable in this way can be (and should be) discarded.

Starting from production "[1] document", I believe that the following symbols are unreachable in the XML 1.0 grammar:

    [6]  Names
    [8]  Nmtokens
    [30] extSubset
    [33] LanguageID
    [78] extParsedEnt
    [79] extPE

I believe that, if the errata are taken into account (and they should be rolled into the main document immediately), then all of these productions are used at least in Validity Constraints. But then they're not part of the grammar in the same sense that the other productions are, which is what their membership in the numbering scheme would seem to imply. It's odd and confusing not to be able to understand the grammar, at least on a purely syntactic level, without reading the accompanying prose.

I would like to see unreachable symbols clearly marked in some way -- perhaps given a different numbering scheme to show that they are not part of the "main" grammar in the same way as the other productions are. Maybe something like VC-1, VC-2, etc.

2. There is no number 2.
------------------------

(I guess I'll limit this to my main point for now. Maybe more later.)

Thanks very much for your attention, and I'd be very interested to hear your thoughts --

--------------------------------
Mick
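
The reachability check described in point 1 can be sketched in a few lines. This is only an illustration under stated assumptions: the grammar is modelled as a mapping from each production name to the names it references, the function name `unreachable_symbols` is made up for this example, and `toy_grammar` is a hypothetical miniature grammar, not the XML 1.0 production list.

```python
from collections import deque


def unreachable_symbols(grammar, start):
    """Return, sorted, the productions that cannot be reached from `start`.

    `grammar` maps each production name to the list of symbol names that
    appear on its right-hand side (an assumed representation).
    """
    seen = {start}
    queue = deque([start])
    while queue:
        symbol = queue.popleft()
        for referenced in grammar.get(symbol, ()):
            # Only follow names that are themselves productions,
            # and visit each production at most once.
            if referenced in grammar and referenced not in seen:
                seen.add(referenced)
                queue.append(referenced)
    return sorted(set(grammar) - seen)


# Hypothetical toy grammar for illustration only (NOT the XML 1.0 grammar).
toy_grammar = {
    "document": ["prolog", "element"],
    "prolog": ["comment"],
    "element": ["name", "content"],
    "content": ["element", "comment"],
    "name": [],
    "comment": [],
    "names": ["name"],  # defined, but nothing reachable from "document" uses it
}

print(unreachable_symbols(toy_grammar, "document"))  # prints ['names']
```

Running the same traversal over a machine-readable copy of a spec's productions would list exactly those productions that are numbered with the rest but never used from the start symbol.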
Received on Tuesday, 23 May 2000 11:56:59 UTC