Re: FPI Mythology (was: XHTML Considered Harmful)

On Sat, 30 Jun 2001, I wrote:

[ Summarizing Dr. Goldfarb on "External Subset Considered Harmful" ]

> eliminating the external subset special syntax would
> 
>  1. Simplify parsing by eliminating an unnecessary variant form.
> 
>  2. Allow a single set of rules for declaration processing, based on
>     declaration type and provenance in an external entity, without
>     any need for special casing (including the lexically non-obvious 
>     one that the "special" external subset is processed last.)
> 
> I don't recall any objections other than those based on the status of
> element declarations (which was in doubt at the time), but in the
> event, the external subset syntax remained in the XML spec. 

It may be worth pointing out that FPIs have no special status or
privilege in the XML spec.  Ignoring for the moment that the reference
to ISO 8879 in the XML spec isn't even normative (!), the SGML
declaration used in XML documents specifies FORMAL NO, so the use of
ISO 9070 syntax is at best a convention.  Also, since XML doesn't
have short references, the use of the 'DTD' public text class in FPIs
(meaning "document type declaration subset", which in SGML may also
contain short reference sets) is a misnomer, or an overspecification
at best.  See [112]-[115] here:

 http://www.oreilly.com/people/staff/crism/sgmldefs.html

: 10.2.2.1 Public Text Class
:
: [86] public text class =
:         _name_
: 
: The _name_ must be one that identifies an SGML construct in the
: following list:
:  Name          SGML Construct
:  CAPACITY      capacity set [180]
:  CHARSET       character data [47]
:  DOCUMENT      SGML document [1]
:  DTD           document type declaration subset [112]
:  ELEMENTS      element set [114]
:  ENTITIES      entity set [113]
:  LPD           link type declaration subset [161]
:  NONSGML       non-SGML data entity [6]
:  NOTATION      character data [47]
:  SHORTREF      short reference set [115]
:  SUBDOC        SGML subdocument entity [3]
:  SYNTAX        concrete syntax [182]
:  TEXT          SGML text entity [4]

(Annex K has added 'SD' for "SGML declaration body" to this list.)
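
To make the point about convention concrete, here is a rough Python
sketch of what "reading" the public text class out of an FPI-style
string amounts to.  It is only an illustration under simplifying
assumptions: the function name is my own, it ignores the unavailable
text indicator and the display version, it does not include the Annex K
'SD' class, and the second identifier in the usage lines is made up.

```python
# The thirteen public text class names listed in Clause 10.2.2.1.
# (The 'SD' class added by Annex K is deliberately left out here.)
PUBLIC_TEXT_CLASSES = {
    "CAPACITY", "CHARSET", "DOCUMENT", "DTD", "ELEMENTS", "ENTITIES",
    "LPD", "NONSGML", "NOTATION", "SHORTREF", "SUBDOC", "SYNTAX", "TEXT",
}


def public_text_class(fpi):
    """Return the public text class of an FPI-style string, or None.

    Hypothetical helper, not part of any library.
    """
    fields = fpi.split("//")
    # Registered ("+") and unregistered ("-") owner identifiers carry a
    # one-character prefix field; ISO owner identifiers have none.
    if fields[0] in ("+", "-"):
        fields = fields[1:]
    # We need at least an owner field and a text identifier field.
    if len(fields) < 2:
        return None
    # The text identifier starts with the class name, then a space.
    name = fields[1].split(" ", 1)[0]
    return name if name in PUBLIC_TEXT_CLASSES else None


print(public_text_class("-//W3C//DTD XHTML 1.0 Strict//EN"))
print(public_text_class("-//Example//ELEMENTS Example Vocabulary//EN"))
```

Nothing in an XML processor is required to do anything of the kind;
to XML the whole public identifier is an opaque string.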

From the provisions of Clause 10.2.2.1, the correct public text class
for entities with declarations is 'ELEMENTS'.

This applies a fortiori to the various HTML specs.  Perhaps
eliminating the 'DTD' TLA could go a long way in dispelling myths.


Arjun

Received on Saturday, 30 June 2001 22:33:18 UTC