- From: Peter Murray-Rust <Peter@ursus.demon.co.uk>
- Date: Mon, 21 Apr 1997 16:08:38 GMT
- To: w3c-sgml-wg@w3.org
In message <1.5.4.32.19970421065409.0069c20c@mail.u-net.com> Martin Bryan writes:
> At 10:41 20/4/97 GMT, Peter Murray-Rust wrote:
[...]
>
> There is a difference between loss of markup and loss of data. Whilst both
> constitute information, no data should be discarded just because there is
> an error in a piece of markup. XML should at very least retain the following
> data as part of the last validly opened element.

Although I'm not an SGML expert, I take a different view, in that markup and data are both essential parts of the document. I am prepared to write the following:

<?XML VERSION="1.0"?>
<!DOCTYPE CML-LIKE [
<!ELEMENT CML-LIKE ANY>
<!ELEMENT MASS (#PCDATA)>
<!ATTLIST MASS UNITS CDATA "KILOGRAM">
]>
<CML-LIKE>
The mass of the reactant was <MASS UNITS="grams">3</MASS> which was
clearly unsafe...
</CML-LIKE>

'grams' is as much a part of the document as '3'. If (and I'm not saying it's recommended) UNITS defaults to the appropriate SI unit then an omitted UNITS attribute will be automatically interpreted as kg.

The above document is WF. Without the quotes round 'grams' it is broken. It's quite conceivable that a parser would simply say 'corrupt attribute omitted'. Then the clever application (which is used to working with DTD-less documents) inserts the 'kg' string. The reader must at least see a little flag saying 'broken', since the result is as broken as if the 3 were replaced by 3000.

There has been a presumption in some of the discussion that it's up to the authors and readers not to do foolish things in XML. The problem is that *if you don't have experience in SGML* it's incredibly easy to do foolish things unless prevented. Most people see the FPI on HTML documents and think it's a ritual (they're right - it normally is). But in XML that string _matters_. It can change your document.

I understand the points of view that are being put for leniency in processing documents.
However, if we are selling XML on the basis that it can control rockets, we must appear to show that we care about precision. No-one can stop the rest of the world working with broken documents, but I think we have to promote the value of XML as _supporting_ precision. In my mind:

"HTML is great because you can send people broken documents"
"XML is great because you can't send people broken documents"

YMMV

P.

> ----
> Martin Bryan, The SGML Centre, Churchdown, Glos. GL3 2PU, UK
> Phone/Fax: +44 1452 714029 WWW home page: http://www.sgml.u-net.com/
>
>

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
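
The unquoted-attribute scenario described in the message can be sketched in Python. This is only an illustration under stated assumptions: the document is trimmed to its element content (no XML declaration or DOCTYPE), the ATTLIST default is mimicked by a hypothetical `DEFAULT_UNITS` constant, and `read_mass_lenient` is an invented stand-in for a "helpful" recovering reader, not any real parser's behaviour. The strict path uses the standard library's `xml.etree.ElementTree`, which rejects the broken document outright.

```python
import re
import xml.etree.ElementTree as ET

# Element content of the example document, without the prolog (assumption:
# the DTD default is modelled by DEFAULT_UNITS below instead).
WELL_FORMED = (
    '<CML-LIKE>The mass of the reactant was '
    '<MASS UNITS="grams">3</MASS> which was clearly unsafe...</CML-LIKE>'
)
# Same document with the quotes round 'grams' removed: not well-formed.
BROKEN = WELL_FORMED.replace('"grams"', 'grams')

DEFAULT_UNITS = "KILOGRAM"  # hypothetical stand-in for the ATTLIST default


def read_mass_strict(doc):
    """Return (value, units); raises ET.ParseError if doc is not well-formed."""
    root = ET.fromstring(doc)
    mass = root.find("MASS")
    return float(mass.text), mass.get("UNITS", DEFAULT_UNITS)


def read_mass_lenient(doc):
    """Hypothetical lenient reader: silently drops unquoted attributes."""
    try:
        return read_mass_strict(doc)
    except ET.ParseError:
        # 'corrupt attribute omitted': strip name=value pairs lacking quotes.
        repaired = re.sub(r'\s+[\w-]+=[^\s"\'>]+(?=[\s/>])', '', doc)
        return read_mass_strict(repaired)


if __name__ == "__main__":
    print(read_mass_strict(WELL_FORMED))   # units come from the document
    print(read_mass_lenient(BROKEN))       # units silently become the default
    try:
        read_mass_strict(BROKEN)
    except ET.ParseError as err:
        print("strict reader refused the document:", err)
```

In this sketch the lenient reader returns a plausible-looking mass in the wrong unit with no indication that anything was repaired, whereas the strict reader forces the error to be seen.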
Received on Monday, 21 April 1997 11:30:52 UTC