the return of the Public Identifier Question

Executive summary:  ERB will take up the question of public identifiers
next week.  Current leaning appears to be toward
  - including syntax for public ids
  - defining a resolution mechanism which all conforming XML applications
    must support (non-exclusively) -- if not in the next draft, then in
    a future revision
  - not specifying the order in which applications should try
    the public and the system ids, but requiring that if the first one
    fails, the other one should be tried.


During its meeting today (19 March 1997), the ERB discussed (among other
things) the preparation of a corrected draft of the XML language spec in
time for distribution at the Web conference next month.

In particular, we agreed to vote, next week, on the issue of public
identifiers and their inclusion in the XML-Lang spec.  This issue was
discussed in the WG when the subcommittee draft was released on 31
January, but without reaching an absolutely clear consensus.  The ERB
discussion (and an informal straw poll) made fairly clear that the ERB
is leaning to the position(s) described below.  Those with an interest
in this topic have a week to confirm the ERB in its leanings, or make a
conclusive case for another solution.  (Simple reiterations of arguments
already made do not, however, qualify as a conclusive case.)

1 There is strong but not unanimous sentiment for changing the syntax of
external identifiers in XML to allow public identifiers.  Some on the ERB
would strongly prefer that a resolution mechanism be specified as well,
but at least some of the pro-resolution camp are willing to add the
syntax even if no consensus can be reached on a resolution method.

2 There is also a general leaning toward the view that if public
identifiers are included, a resolution mechanism should also be defined.
(Pro:  an implementer can read the spec and know what is involved in
supporting it.  Con:  there is no currently accessible resolution
mechanism that appears to command consensus, so there is nothing ready
for inclusion in the XML language spec.)

If we can agree on a suitable resolution mechanism, we'll include it
in the revised spec (see below for an explanation of why this seems
unlikely); if we can't, we'll include the syntax in the spec anyway,
with a note that work on a resolution method is continuing.

3 There appear to be three approaches to resolution that command
or could command non-negligible support:

  a SGML Open Catalogs, as specified in the current version of the
    relevant SGML Open technical resolution
  b a simplified form of SGML Open Catalogs, not necessarily that
    proposed on 31 January by the subcommittee
  c reliance on URN resolution mechanisms

With regard to these, the ERB leanings appear to be:

  a Support for full SGML Open catalogs is probably more work than
    should be demanded of XML implementors; the relevant TR should
    probably be mentioned as a relevant standard, but not incorporated
    in full.
  b The ERB is leaning toward including a suitable simplification of
    SGML Open catalogs in the XML-Lang spec as a required
    minimum for XML implementations.  Conforming implementations may
    support additional resolution techniques as well, but should all
    support at least this one.  However, there is no consensus that
    the current subcommittee draft hits the right note here, and it
    seems likely that the draft of 31 March (or whenever) will have
    just a promissory note.
  c URNs will be a plausible mechanism to consider when they are
    complete, but this appears not yet to be the case.

4 If an external id contains both a system identifier and a public
identifier, the XML spec might specify which to try first, when to try
both, etc., or it might leave such things unspecified.  The possible
policies appear to be these:

  a Forbid this combination:  one or the other is allowed, but not
    both.  A minority of the ERB favored this; others in the ERB
    felt they could live with it, but favored another approach.

  b System first, then public (if the system id 'fails', whatever
    an implementation decides that means).  No support for this.

  c Public first, then system (if the public id is not found in the
    catalog).  One vote for this.

  d Implementations may choose which to try first, but if the first
    ID it tries fails, then the implementation should try the other
    one.  I.e. implementations may *not* say "If both a PUBLIC and
    a SYSTEM identifier are given, the XXXXX one is processed and
    the YYYYY one is ignored."  Strong support for this view.

  e Leave unspecified, as in 8879.  No support at all for this

  f Leave for future decision.  Minority support for this.

If anyone on the WG has reached enlightenment on these issues in the
time since they were last discussed, please share your light with the
rest of us.

-C. M. Sperberg-McQueen