- From: Bert Bos <bbos@mygale.inria.fr>
- Date: Sun, 11 May 1997 19:29:57 +0200 (MET DST)
- To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
Tim Bray writes:
> At 06:55 PM 5/7/97 CDT, Michael Sperberg-McQueen wrote:
> >On Wed, 7 May 1997 06:41:14 -0400 Peter Murray-Rust said:
> >>This is doubtless not news to any of you, but it's a shock to me, that
> >>WF documents and validated documents ***GIVE DIFFERENT OUTPUT***. I
> >>am sure that this will be a rich source of confusion.
> >
> >Yes, it will. What may not be obvious is that despite that confusion,
> >this behavior really was the best available at the time.
>
> Actually, it's worse than Peter thinks. There are at least three ways
> in which DTD-less and DTD-ful processing can produce different
> results:
>
> 1. White space in element content

That is easy to fix by selecting a single whitespace handling method
in the XML profile for SGML. `Keep-all-whitespace' is ugly, but
workable; a better rule is to simply ignore any newline directly
after a '>' or directly before a '<'. The important thing is that
this rule becomes part of the XML profile, and does not depend on the
XML document itself.

> 2. Default attributes

The previous XML-lang draft had a handy macro <?xml default...?> that
I find very useful, especially in dealing with XML-link, where a lot
of elements have fixed attributes. Without it, a document like this

    <!doctype foo "foo">
    <foo/>

with this DTD

    <!element foo any>
    <!attlist foo att (def) def>

is not valid (for some definition of "valid"), since the DTD says that
the "att" attribute cannot be #implied. Note that it could be omitted
if this was SGML-1986, but in XML it cannot.

> 3. Attribute values that are space/case normalized only if you
> read the DTD and know they are NMTOKEN or ID or something.

This is another thing that will have to be added to the XML profile
for SGML: all attributes are always treated as CDATA and never
normalized. NMTOKEN, NUMBER, etc. can still be used for validation,
but do not influence the parsing.
I.e., in the XML datamodel the attributes foo="7" and foo="07" are
different, even though some application may treat them the same.

Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/pub/WWW/People/Bos/                      INRIA/W3C
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 4 93 65 77 71                06902 Sophia Antipolis Cedex, France
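
The whitespace rule proposed in the reply to point 1 (ignore a newline directly after a '>' or directly before a '<') could be sketched roughly as follows. This is only an illustration, not part of any XML draft: the function name `strip_markup_newlines` is made up here, it assumes that exactly one line break is dropped at each such position, and it works on raw text with a regular expression, so a '>' or '<' inside a comment or an attribute value would be treated like a tag delimiter.

```python
import re

# One line break in any of the usual conventions (CRLF, LF, CR).
_NEWLINE = r"(?:\r\n|\n|\r)"


def strip_markup_newlines(text: str) -> str:
    """Drop a single line break right after '>' and right before '<'.

    Hypothetical helper: a text-level approximation of the proposed
    rule, applied without consulting any DTD.
    """
    # Remove one line break immediately following a '>'.
    text = re.sub(">" + _NEWLINE, ">", text)
    # Remove one line break immediately preceding a '<'.
    text = re.sub(_NEWLINE + "<", "<", text)
    return text


if __name__ == "__main__":
    sample = "<p>\nline one\nline two\n</p>"
    # The line break between the two text lines is kept, because it is
    # adjacent to neither '>' nor '<'.
    print(repr(strip_markup_newlines(sample)))
```

Because the rule looks only at the characters next to the line break, the result is the same whether or not a DTD has been read, which is the property the reply asks for.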
Received on Sunday, 11 May 1997 13:30:38 UTC