- From: Peter Murray-Rust <Peter@ursus.demon.co.uk>
- Date: Thu, 12 Jun 1997 22:47:22 GMT
- To: w3c-sgml-wg@w3.org
In message <199706111436.KAA20852@www10.w3.org> Michael Sperberg-McQueen writes: [...] > > So the difference between processors which read the DTD and those > which don't is not validating vs. non-validating, and not conformant > vs. non-conformant. It's just processors which read the DTD vs. > those which don't. The former is a superset of validating > processors. We are still struggling with this on xml-dev. May I suggest an alteration to the parsing/reading philosophy? 2.10 rightly identifies three areas where knowledge of the DTD is required to pass 'correct' output to the application. #1 is default attribute values; #2 is entities. *IF* these are declared in the DTD I cannot see that it's reasonable for a processor/parser to ignore them. The spec states that RMD="NONE" is allowed only if such declarations don't exist; otherwise it's an error, but a non-DTD-reading parser (or one blindly following NONE) can never flag this error. Therefore the only course that seems reasonable to me is: <PROPOSAL> All XML processors must read the DTD(s) </PROPOSAL> The problem seems to arise with #3, the analysis of whitespace :-( The appropriate treatment of this depends on: "reading and analysing the element content" and I suggest that a phrase like this should is what is intended by "reads the DTD". (I think it only occurs in this context). (Whether it is possible to analyse the whitespace problem without validating the content at the same time anyway I don't know.) The consequences of this - if accepted - are that all parsers have to be able to deal with any construct in the DTD, including PEs :-). So my final analysis - shoot it down - is: 1. all parsers must read the DTD or DTDs if present and process them including: - substitution of parameter entities - handling of multiple ATTLIST declarations 1a. all parsers must add default attributes to start-tags in the document 1b. all parsers must expand entities in the document as in 4.3.1 and 4.3.2 2. a parser is validating or non-validating. 2a. a non-validating parser may/must/must_not[1] analyse an element's content and may/must/must_not apply the current rule for whitespace insertion or deletion. 3. [2] 4. [3] [1] to be determined by the ERB [2] error recovery in validation of a WF document, to be determined by the ERB [3] error recovery in WFness of a non-WF document, ditto. P. -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/
Received on Thursday, 12 June 1997 18:14:38 UTC