Re: the return of the Public Identifier Question from Paul Grosso on 1997-03-19 (w3c-sgml-wg@w3.org from March 1997)

From: Paul Grosso <paul@arbortext.com>
Date: Wed, 19 Mar 97 17:06:02 CST
To: w3c-sgml-wg@w3.org
Message-Id: <9703192306.AA05827@atiaus.arbortext.com>

> From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>

> 4 If an external id contains both a system identifier and a public
> identifier, the XML spec might specify which to try first, when to try
> both, etc., or it might leave such things unspecified.  The possible
> policies appear to be these:

>   c Public first, then system (if the public id is not found in the
>     catalog).  One vote for this.
> 
>   d Implementations may choose which to try first, but if the first
>     ID it tries fails, then the implementation should try the other
>     one.  I.e. implementations may *not* say "If both a PUBLIC and
>     a SYSTEM identifier are given, the XXXXX one is processed and
>     the YYYYY one is ignored."  Strong support for this view.

For completeness, there is another choice, which is the one TR9401 takes:

g Don't dictate in the standard which one to use first, but require
  that all implementations provide a user-settable option/switch/whatever
  that indicates which one to try first.

Since (d) allows me to write an implementation that tries PUBLIC
first and then SYSTEM, I'm happy with (d) from an implementor's
point of view (because I think PUBLIC should be tried first).

But from an author/user's point of view, I think I have a problem
with (d).  This is how I see it:

1.  If the author wishes a specific object to be referenced, she
    will just use SYSTEM and not PUBLIC.

2.  So by specifying both SYSTEM and PUBLIC, the author is suggesting
    it's okay to give the recipient control over resolution.  

3.  But, the only way I see that the recipient has control over
    resolution is if the PUBLIC identifier is tried first (since there
    is no indirection of system ids unless XML requires use of a
    catalog that implements the SYSTEM entry type, which no one seems
    to be suggesting).

4.  If I want to use my version of an entity--I think we all agree
    there are lots of good reasons for that, such as my version has SDA
    attributes or I want to use my set of character entity mappings--I
    use some mechanism (such as a catalog) to point the PUBLIC
    identifier to my storage object.  But what good does that do if
    the SYSTEM identifier is tried first?

The idea of trying both in any order is fine if all you're trying to
do is provide alternative resolutions to *equivalent* storage objects;
that is, you are just trying to avoid "broken references," and any
successful reference is equally acceptable.  But I do not see option
(d) as working if you agree that an important reason for PUBLIC ids
is to allow recipient/user level control over resolution.
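
To make point 4 concrete, here is a rough sketch in Python.  Everything
in it is made up for illustration: the catalog, the storage table, the
example.org URL, and the resolve() function with its "prefer" switch
are assumptions, not anything taken from TR9401 or the XML draft.  The
switch stands in for the implementation's (or, under (g), the user's)
choice of which identifier to try first, with fallback to the other.

```python
from typing import Dict, Optional

# Hypothetical recipient catalog: maps a public identifier to a local object.
CATALOG: Dict[str, str] = {
    "ISO 8879:1986//ENTITIES Added Latin 1//EN": "/local/my-isolat1.ent",
}

# Hypothetical set of storage objects that can actually be retrieved.
STORAGE: Dict[str, str] = {
    "/local/my-isolat1.ent": "recipient's own mappings",
    "http://example.org/isolat1.ent": "author's copy",
}


def resolve(public_id: Optional[str], system_id: Optional[str],
            prefer: str = "public") -> Optional[str]:
    """Try one identifier, then fall back to the other.

    `prefer` names the identifier tried first ("public" or "system").
    Returns the location of the storage object found, or None.
    """
    if prefer not in ("public", "system"):
        raise ValueError("prefer must be 'public' or 'system'")

    def via_public() -> Optional[str]:
        # The public identifier resolves only through the catalog.
        target = CATALOG.get(public_id) if public_id else None
        return target if target in STORAGE else None

    def via_system() -> Optional[str]:
        # The system identifier is used directly, with no indirection.
        return system_id if system_id in STORAGE else None

    if prefer == "public":
        order = [via_public, via_system]
    else:
        order = [via_system, via_public]

    for attempt in order:
        found = attempt()
        if found is not None:
            return found
    return None


pub = "ISO 8879:1986//ENTITIES Added Latin 1//EN"
sysid = "http://example.org/isolat1.ent"

print(resolve(pub, sysid, prefer="public"))  # /local/my-isolat1.ent
print(resolve(pub, sysid, prefer="system"))  # http://example.org/isolat1.ent
```

With PUBLIC tried first, the recipient's catalog entry wins.  With
SYSTEM tried first, the author's copy is found and the catalog is
never consulted, so the recipient's override only takes effect when
the system identifier happens to fail.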

For simplicity, I prefer option (c), though option (g) also addresses
the requirement to allow the end user to have control over resolution.

paul

Received on Wednesday, 19 March 1997 18:11:01 UTC