RE: Options for dealing with IDs from Rick Jelliffe on 2003-01-13 (www-tag@w3.org from January 2003)

From: Rick Jelliffe <ricko@topologi.com>
Date: Tue, 14 Jan 2003 01:24:03 +1100
To: <www-tag@w3.org>
Message-ID: <003c01c2bb0f$6cd18850$4bc8a8c0@AlletteSystems.com>

(Repost from Wednesday. This never made the archive, not sure why.)

I think Chris misses out another option:

?) Refactor XML so that there are four kinds of XML processors: headlessWF, 
    WF, typed, and valid.  Deprecate WF in favour of headlessWF and typed in all
    W3C specifications. 

   - Headless WF must have no DOCTYPE. 
   - Typed would use the DTD for entity expansion, decoration and type annotation 
    only, without validating.  Typed may also include built-in defaults for standard
    entity sets.
 
  A typed XML processor will result in all attributes of type 
  ID being so noted in the Infoset. 
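
  As a rough illustration of what "typed" processing could look like, here is
  a minimal sketch in Python using the standard library's non-validating expat
  bindings. The function name typed_parse, the sample document and the shape
  of the returned mapping are assumptions made for this example only; they are
  not part of any specification or of the proposal itself.

```python
import xml.parsers.expat

# Sample document: the internal subset declares only the ID attribute.
DOC = """<!DOCTYPE doc [
  <!ATTLIST item key ID #IMPLIED>
]>
<doc><item key="a1"/><item key="b2" note="x"/><other key="c3"/></doc>
"""


def typed_parse(text):
    """Read the DTD declarations for typing only; never validate.

    Returns a dict mapping each ID value to the name of the element
    that carries it (a stand-in for "noted in the Infoset").
    """
    id_attrs = set()  # (element name, attribute name) pairs declared as ID
    ids = {}          # ID value -> element name

    def on_attlist(elname, attname, type_, default, required):
        # Called once per attribute declared in an ATTLIST.
        if type_ == "ID":
            id_attrs.add((elname, attname))

    def on_start(name, attrs):
        # Annotate attributes whose declared type is ID.
        for attname, value in attrs.items():
            if (name, attname) in id_attrs:
                ids[value] = name

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = on_attlist
    parser.StartElementHandler = on_start
    parser.Parse(text, True)
    return ids


print(typed_parse(DOC))
```

  The undeclared "other" element is accepted because nothing is validated,
  and its key attribute is not treated as an ID because no declaration says
  it is one.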

  Advantages:
  - headless gives lightweight XML for those who need it, especially
   those who want to enforce no DTDs: so reflects common
   practice in some industries
  - always makes sure that documents with a DTD have
    exactly the same infoset (except validation info) whether they are 
   valid or just typed
  - existing mechanism (DTDs)
  - allows small documents, with just the declarations for
    IDs in the prolog, so document sizes can be comparable
    to proposals for inline declarations
  - typed gives a form of XML that HTML can use without
    buying into DTD validation: the DTD for XHTML would
    only have ID attributes (and 
  - addresses HTML's problem, rather than farting around one
    particular issue at a time: the problem is not "we need IDs"
   or "HTML needs entities" but that "WF versus Valid is proving
   to be the wrong split."
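
  The headless rule (no DOCTYPE at all) is simple to check. A minimal sketch,
  again in Python with the standard expat bindings; the function name
  is_headless_wf is hypothetical and chosen only for this example.

```python
import xml.parsers.expat


def is_headless_wf(text):
    """Return True if text is well-formed and carries no DOCTYPE.

    Raises xml.parsers.expat.ExpatError if the document is not
    well-formed at all.
    """
    seen = {"doctype": False}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Any DOCTYPE declaration disqualifies the document.
        seen["doctype"] = True

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(text, True)
    return not seen["doctype"]


print(is_headless_wf("<a/>"))
print(is_headless_wf("<!DOCTYPE a [<!ATTLIST a id ID #IMPLIED>]><a/>"))
```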

  Disadvantages:
  - existing mechanism is deemed poor, but this is partly due to
    the stupidity of making DTD interpretation at the whim of the parser
  - not namespace aware (though this could be fixed at the same time:
    indeed, one of the ISO DSDL schema languages is to use DTDs externally
    with namespace awareness and no syntax change: it is W3C who is
    conservative on adding namespace-awareness to DTDs.)
  - performs/conflates type annotation and decoration, though why this is
   so bad when there is a strict processing order eludes me

In other words, this should be XML 1.2.  No existing documents
would become invalid. The HTML entity problem would be resolved.
The ID problem would be resolved. The roulette Infoset problem 
would be resolved. The lack of namespace awareness in DTDs
would be resolved. The need for an official lightweight form of XML
would be resolved.

Cheers
Rick Jelliffe

Received on Monday, 13 January 2003 22:00:46 UTC