FPI Mythology (was: XHTML Considered Harmful)

On Wed, 27 Jun 2001, William F. Hammond wrote:

> [...] where I understand "tag soup" to mean HTML without a
> document type declaration.

The document type declaration has nothing to do with Tag Soup except
perhaps as just another "tag" of sorts.  (Witness the cretin at
Microsoft who tries to explain the "!DOCTYPE element"

http://msdn.microsoft.com/workshop/author/dhtml/reference/objects/doctype.asp

and remarks that it "does not require a closing tag".  Sigh.)

Tag Soup is just putting "commands between < and > signs".

> There is no suggestion in any spec I know that a given FPI should
> have more than one method of construal in an SGML or XML parser.

I think you've misapprehended the "construal" involved.

> (Arjun: were you seriously suggesting that a given FPI used in a
> document type declaration to refer to an external document type
> definition, with, for the sake of discussion, no internal subset,
> can be used with more than one SGML declaration?)

Yes.  There's nothing unusual in that.  All that matters is whether
the markup declarations constituting the contents of the external
entity are intelligible given the provisions (roughly, the SCOPE,
SYNTAX and FEATURES sections) of any specific SGML declaration.

On p.450 of _The SGML Handbook_, Dr. Goldfarb starts the commentary on
Clause 13 "SGML Declaration" with this:

:  The SGML declaration contains instructions to the SGML parser that
:  are independent of the document types and the link processes.

The SGML declaration precedes the prolog of a document, and thus
is processed *before* a document type declaration is encountered,
never mind that a document type declaration (a) may not be present,
and (b) even if it is, need *not* have this FPI construct at all.
  
For one thing, the FPI does not refer to a document type definition,
only a declaration subset (i.e. a collection of markup declarations).  
The 'DTD' public text class stands for 'document type declaration
subset', not 'document type definition' (ISO 8879, Clause 10.2.2.1).

Nor is the construct referential in intent: it does not suffice
merely to recognize the identifier in order to "know what's going on".
It is syntactic shorthand for including the entity's contents at the
end of the declaration subset by dereferencing a parameter entity.
That is, this

 <!DOCTYPE foo PUBLIC "-//Whoever//DTD Whatever//EN" >

is by definition shorthand - and **only** shorthand! - for this

 <!DOCTYPE foo [
    <!ENTITY % bar PUBLIC "-//Whoever//DTD Whatever//EN">
    %bar; ]>

(See, e.g., the commentary on pp. 402-403 of _The SGML Handbook_ for
Clause 11.1.)
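
To make the equivalence concrete, here is a minimal Python sketch of
that rewriting.  It is an illustration under assumptions, not how any
SGML parser does it: the function name expand_doctype, the regular
expression, and the default entity name "bar" are all hypothetical.
It handles only a declaration with a public identifier, an optional
system identifier, and no internal subset, and it assumes the keywords
are written in upper case as above.

```python
import re

# Hypothetical pattern: a document type declaration with a public
# identifier, an optional system identifier, and no internal subset.
_DOCTYPE = re.compile(
    r'<!DOCTYPE\s+(?P<name>\S+)\s+PUBLIC\s+'
    r'(?P<pubid>"[^"]*"|\'[^\']*\')'
    r'(?:\s+(?P<sysid>"[^"]*"|\'[^\']*\'))?'
    r'\s*>'
)


def expand_doctype(decl, entity="bar"):
    """Rewrite the shorthand form as an explicit parameter entity
    declaration followed by a reference to that entity."""
    m = _DOCTYPE.fullmatch(decl.strip())
    if m is None:
        raise ValueError("expected <!DOCTYPE name PUBLIC ...> "
                         "with no internal subset")
    # The external identifier is copied over unchanged.
    ext_id = "PUBLIC " + m.group("pubid")
    if m.group("sysid"):
        ext_id += " " + m.group("sysid")
    return (
        f"<!DOCTYPE {m.group('name')} [\n"
        f"   <!ENTITY % {entity} {ext_id}>\n"
        f"   %{entity}; ]>"
    )


if __name__ == "__main__":
    print(expand_doctype(
        '<!DOCTYPE foo PUBLIC "-//Whoever//DTD Whatever//EN" >'))
```

Given the first declaration above, the function returns the second
form.  The FPI gets no treatment beyond being copied into a parameter
entity declaration, which is the whole point.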

I really wish the mythology of the FPI usually found in document type
declarations being special in some mystical way would die.  

 http://www.oasis-open.org/cover/n1957Note.html
 http://groups.google.com/groups?as_umsgid=34E9CBC9.401B6BB0@isogen.com
 http://groups.google.com/groups?as_umsgid=29vv3tgnic33pnuf4nnsm4ml21o6rjaih8@4ax.com

(Personally, I think Dr. Goldfarb was right when he proposed "External
Subset Considered Harmful" for XML.  His message is not publicly
available because the W3C continues to keep the XML-SIG archive under
wraps in the Members Area of lists.w3.org.)

One final point:

> For example with SP one uses a catalog.  The catalog may be
> specified as an argument to SP.  Each catalog points to an SGML
> declaration.  

Not necessarily.  The SGML declaration can be provided as an entity in
the input to SP.  There is *no* necessary association between SGML
declarations and particular declaration subsets in external entities.


Arjun
  

Received on Wednesday, 27 June 2001 23:47:06 UTC