Re: the return of the Public Identifier Question from Michael Sperberg-McQueen on 1997-03-20 (w3c-sgml-wg@w3.org from March 1997)

From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
Date: Thu, 20 Mar 97 09:23:41 CST
To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
Message-Id: <199703201608.LAA13707@www10.w3.org>

On Wed, 19 Mar 1997 21:50:35 -0800 Terry Allen said:
>This spec constrains processors/parser, posits applications, but
>in its current version makes no requirements on apps (have I missed
>something?)  BTW, these are not rhetorical questions I ask; I really
>do expect responses, particularly from SGML ERB members.  Response
>to public comment builds credibility and legitimacy.

The error is mine.  My use of the word 'application' was careless and
(I now see, in Terry's response) misleading.  It is the XML
processor (the target, as Terry points out, of all the constraints
in the current draft spec) which may or may not be required to
support public identifiers and their resolution, or at least
indirection.

>| 2 There is also a general leaning toward the view that if public
>| identifiers are included, a resolution mechanism should also be defined.
>| (Pro:  an implementer can read the spec and know what is involved in
>| supporting it.  Con:  there is no currently accessible resolution
>| mechanism that appears to command consensus, so there is nothing ready
>| for inclusion in the XML language spec.)
>
>It is not clear here whether the specified resolution mechanism
>is a fallback or must be applied first.   Paul has suggested that
>not saying is the best policy; that's okay by me, but the XML spec
>must say what is conformant behavior.

Correct:  it's not clear in my posting because the ERB didn't get to
that level of detail, and correct:  the spec needs to say what is
conformant behavior.

My two cents:  we want the ability to have all XML processors handle
some resolution / indirection mechanism, in the same way, so that
publishers don't have to provide SGML Open catalogs for Panorama, and
something else for Vistabrowser, and a third thing for MSIE, and a
fourth thing for ..., and an nth thing for Netscape.  (Well, the nth
thing may not be under our control, but it would be nice if all the
browsers that actually *support* XML are interoperable at some level.)
That means all XML processors have to be *able* to handle the Minimum
Resolution Method (MRM).

I don't think (speaking for myself here) that we particularly want or
need to forbid processors from experimenting with other resolution
methods that might work better in particular environments, so I don't
want to say the MRM must be the Only Sole Exclusive Resolution Method,
or even that it be the primary method.

I'd be open to saying "At user option, conforming processors must
support the MRM" or "At user option, conforming processors may support
resolution methods other than MRM" -- I think these two are equivalent
in effect, but my brain has not yet had its minimum daily requirement of
caffeine so I may be wrong.  What I think these mean, and what I'd be
open to having the spec say, is "You can support any methods of
public-identifier resolution you like, but (a) you must support the MRM
and (b) you must give the user the option of telling you to *use* the
MRM."  (N.B. I am not seriously proposing we use the abbreviation MRM in
the spec; it's just a placeholder.)

I'm also open to saying only that conforming processors must support the
MRM and may support other methods, and how they decide which to use is
to be decided by the designer, the implementor, the user, and anyone
else who horns in on the discussion, but is not constrained by XML.
(Sole difference:  implementations are not required to provide a
user-settable option saying "do it this way".)
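
To make the two wordings concrete, here is a minimal sketch in Python of
what a processor's dispatch might look like.  Everything in it is an
assumption for illustration only: the names resolve_public_id, mrm,
extra_methods, and force_mrm are hypothetical, and trying the other
methods first when the user has not asked for the MRM is just one
possible ordering, not something the draft spec says.

```python
from typing import Callable, Optional, Sequence

# A resolver maps a public identifier to a system identifier,
# or returns None when it cannot resolve the identifier.
Resolver = Callable[[str], Optional[str]]


def resolve_public_id(
    public_id: str,
    mrm: Resolver,
    extra_methods: Sequence[Resolver] = (),
    force_mrm: bool = False,
) -> Optional[str]:
    """Resolve public_id, always keeping the MRM available.

    (a) The MRM is a required argument, so it is always supported.
    (b) force_mrm is the hypothetical user option that tells the
        processor to *use* the MRM and nothing else.
    """
    if not force_mrm:
        # Assumed ordering: implementation-specific methods get first try.
        for method in extra_methods:
            system_id = method(public_id)
            if system_id is not None:
                return system_id
    # Either the user insisted on the MRM or nothing else succeeded.
    return mrm(public_id)
```

Dropping the force_mrm parameter gives the weaker variant, in which the
MRM must still be supported but the choice of method is left entirely
to the implementation.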

>| 3 There appear to be three approaches to resolution that command
>| or could command non-negligible support:
>|
>|   a SGML Open Catalogs, as specified in the current version of the
>|     relevant SGML Open technical resolution
>
>If a catalogue can give as the rhs another public identifier, this
>choice does not really result in specifying a resolution mechanism;
>and if that's okay, then punting to mechanisms entirely outside
>XML should be okay, too.  Catalogues are not resolution mechanisms,
>they are indirection mechanisms, and that's just right.

I'll have to reread the SGML Open spec in its current incarnation,
but the last time I looked I thought the rhs of a PUBLIC entry
had to be a system identifier.

But even if it can be another PUBLIC identifier, this is an issue
only if the full SGML Open catalog is taken as the MRM, which seems
unlikely.

> ...
>My point is that resolution (having power working) is not
>indirection (choosing among PG&E, windmills, solar power, etc.),
>and that any choice of method may result in failure of resolution.

I think this is true, but so universally true that I'm not sure I
can derive any consequences from it.  No matter what resolution
method we choose, it can fail.  If we don't choose one at all, but
leave the choice to implementors, it can still fail.  An implementation
can add support for arbitrarily many resolution methods, but unless
someone has a new software delivery method I don't know about, the
set of methods supported will always be finite, and failure will
always be possible.  (Well, perhaps not:  I suppose one might
contract somehow for *guaranteed* service, but I suspect that means
not that you will always have service but that they will pay you
when you don't.)

>| With regard to these, the ERB leanings appear to be:
>|
>|   a Support for full SGML Open catalogs ...
>|   b ... a suitable simplification of SGML Open catalogs ...
>
>Either way you get indirection, not resolution.  If that's okay,
>it needs to be said what this requirement means for the processor/parser,
>the application (a term not used since the executive summary), or
>the implementation (=application?).  I nag at this point because
>the concept of an XML application is currently a loose cannon.

I'm not sure why this is so, unless I've carelessly given the wrong
idea again.  As I understand it, either (a) or (b) would naturally
be coupled with information about where to find the catalogs, and
with some constraints that would ultimately, for any public identifier,
either produce a system identifier or fail.  I think of producing a
system identifier for a resource accessible to the processor as
constituting resolution of the public identifier; am I misusing the
term?
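
As an illustration of what I mean by resolution here, a rough Python
sketch of a drastically simplified catalog follows.  It assumes a
catalog containing only entries of the form PUBLIC "pubid" "sysid",
with both identifiers quoted and one entry per line, and it assumes
that the first matching entry wins; it is not the full SGML Open
catalog syntax.  The function names and the example identifiers are
made up for the sketch.

```python
import re
from typing import Dict, Optional

# Matches one simplified entry: PUBLIC "public id" "system id"
_PUBLIC_ENTRY = re.compile(r'^\s*PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*$')


def parse_catalog(text: str) -> Dict[str, str]:
    """Collect PUBLIC entries; lines in any other form are ignored."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        match = _PUBLIC_ENTRY.match(line)
        if match:
            public_id, system_id = match.groups()
            # Assumption: the first entry for a public id wins.
            entries.setdefault(public_id, system_id)
    return entries


def resolve(public_id: str, catalog: Dict[str, str]) -> Optional[str]:
    """Return a system identifier for public_id, or None on failure."""
    return catalog.get(public_id)


# Hypothetical catalog text for the example.
CATALOG_TEXT = '''
PUBLIC "-//Example//DTD Memo//EN" "http://example.org/dtd/memo.dtd"
PUBLIC "-//Example//DTD Memo//EN" "http://example.org/dtd/other.dtd"
'''

catalog = parse_catalog(CATALOG_TEXT)
```

With this catalog, resolve either hands back a system identifier the
processor can try to fetch, or it fails by returning None; those are
the only two outcomes.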

>But in the current environment of push for solutions to simple problems,
>I would much rather that XML cut loose of this issue.  On the Internet,
>the SGML notions of PUBLIC and SYSTEM aren't too useful, and eliminating
>both in favor of URLs is a win (whatever the metaphysics of URLs).  If
>I want to use URNs, nothing in XML 1.0 prevents me.

Two points of apparent disagreement here.

I think the notions of PUBLIC and SYSTEM are extremely useful on
the Internet, since the ability to provide a *name* (public id) is
a key to enabling software to locate a nearby copy of the thing named,
possibly a local copy, rather than forcing a reload from a distant
server.

And the current draft spec of XML 1.0 does say that system identifiers
are URLs.  There was some sentiment for changing that to say URIs,
and letting the fates decide whether that meant URLs or URNs, but
that change has not been made, that I remember.

>More generally, resolution of indirect names is not unique to XML
>and ought to be dealt with on a system-wide basis, just like resolution
>of URLs.

I always have mixed feelings about insisting on system-wide solutions to
problems that clearly should have such solutions.  If such an insistence
helps *produce* a system-wide solution I can use, then it's good.  But
frequently I have the problem precisely because those responsible for
system-wide solutions have bollixed things up so badly, and cannot be
persuaded to fix them.  In such a case, insisting on the Right Thing (a
system-wide solution) punishes only ourselves, and we'd do better to
find a partial solution we can actually do something about, rather than
wait for a full solution we cannot ourselves create and push through.
(An example:  character-set problems such as identification and
translation ought to have a system-wide solution, at the OS level for
all users of a local system, and at the network level for networks.  But
neither of these is the case for any OS or any network I've yet heard
of, let alone worked on.  All the complaints in the world were not
enough to persuade Rutherford Labs to fix its manifestly false,
unusable, and data-corrupting translate table, when it functioned as the
sole gateway between JANET and EARN/BITNET.  In such cases, one needs to
seek solutions that work around the problems imposed by system-wide
stupidity, and that can be implemented even if those with system-wide
responsibility don't achieve enlightenment.)  I suspect this means,
in this context, that I'll believe URNs work when I see them work,
and in the meantime I'm quite happy with SOCats, which do work, thank
you very much.

-C. M. Sperberg-McQueen
Received on Thursday, 20 March 1997 11:08:54 UTC