- From: Dave J Woolley <david.woolley@bts.co.uk>
- Date: Wed, 18 Apr 2001 11:51:57 +0100
- To: W3C HTML <www-html@w3.org>
> From: Russell O'Connor [SMTP:roconnor@math.berkeley.edu]
>
> > it would probably make it impossible to parse a document
> > without the DTD, which is one of the primary differences
> > between XML and SGML.
>
> XML is SGML. SGML doesn't require a DTD.

[DJW:] You cannot parse general SGML without knowledge of the DTD,
whether that DTD is a formal DTD or just a narrative description.

This is fairly easy to demonstrate with an HTML page that has a broken
document type: the W3C validator will attempt to parse it without the
DTD, and will fail whenever it finds an element with an optional and
omitted closing tag. It assumes that the closing tag was not optional
and so fails to close any elements subsequently opened.

Another example of where SGML cannot be accurately parsed without a DTD
is the style and script elements in HTML, which have a CDATA content
model. Without knowing that they are CDATA, the parser cannot know that
it should not parse their content as markup.

XML, on the other hand, requires that all elements be explicitly closed
and that CDATA sections be explicitly marked as such on every
occurrence. Both of these rules are intended to ensure that you can
create a correct parse tree without knowledge of the DTD. (XML also
doesn't allow optional start tags, which are another thing that needs
the DTD for a correct parse of general SGML.)

SGML doesn't require a <!DOCTYPE, but the parser still needs to be
aware of the underlying DTD.

Whilst XML is probably technically compliant with SGML, it relies on a
processing instruction to allow it to have features which look a bit
like SGML's default concrete syntax, but are actually radically
different - e.g. XML DTDs are not valid SGML DTDs. I'm pretty sure that
the sp tools can only handle XML as a result of hard-coded special
handling for XML, rather than simply using the DTD and specification.
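To make the omitted-closing-tag point concrete, here is a rough Python
sketch. It is not how the W3C validator or sp work internally; the
TreeBuilder class, the parse_sgml_like helper and the IMPLIED_END table
are names invented for this illustration, and the table is only a toy
stand-in for what a real DTD would tell the parser. The same list is
parsed three ways: as SGML-style markup with no DTD knowledge, as
SGML-style markup with the toy DTD knowledge, and as XML with every
element explicitly closed.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical stand-in for DTD knowledge: for an element whose end tag
# may be omitted, the start tags that implicitly close it.
IMPLIED_END = {"li": {"li"}}


class TreeBuilder(HTMLParser):
    """Builds nested (tag, children) tuples from start/end tag events."""

    def __init__(self, implied_end=None):
        super().__init__()
        self.implied_end = implied_end or {}
        self.root = ("#root", [])
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # With DTD knowledge, close any open element whose omitted end
        # tag is implied by this start tag.
        while len(self.stack) > 1 and tag in self.implied_end.get(
            self.stack[-1][0], ()
        ):
            self.stack.pop()
        node = (tag, [])
        self.stack[-1][1].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the innermost open element with this name, if any.
        if tag in [name for name, _ in self.stack[1:]]:
            while self.stack.pop()[0] != tag:
                pass

    def handle_data(self, data):
        if data.strip():
            self.stack[-1][1].append(data.strip())


def parse_sgml_like(text, implied_end=None):
    builder = TreeBuilder(implied_end)
    builder.feed(text)
    builder.close()
    return builder.root[1]


SGML_DOC = "<ul><li>one<li>two</ul>"            # end tags of li omitted
XML_DOC = "<ul><li>one</li><li>two</li></ul>"   # everything closed

without_dtd = parse_sgml_like(SGML_DOC)             # second li nests in first
with_dtd = parse_sgml_like(SGML_DOC, IMPLIED_END)   # two sibling li elements
xml_items = [li.text for li in ET.fromstring(XML_DOC)]  # no DTD consulted

print(without_dtd)
print(with_dtd)
print(xml_items)
```

Without the table, the second li ends up as a child of the first, which
is the same kind of mis-nesting described above. The XML version needs
no such table because the closing tags carry the structure themselves.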
-- 
--------------------------- DISCLAIMER ---------------------------------
Any views expressed in this message are those of the individual sender,
except where the sender specifically states them to be the views of BTS.
Received on Wednesday, 18 April 2001 06:51:54 UTC