Re: ERB discussion of public identifiers

Tim Bray <tbray@textuality.com> writes:
>We spent a lot of time on this question on Dec. 11th, and it is clear we
>need some more help from the WG.
>The ERB is, by at least a substantial majority, convinced that there is a
>real need for PUBLIC identifiers in XML.
>The ERB is highly concerned, in at least a significant minority, about
>the effects of putting this facility in without specifying a resolution
>mechanism.  Doing so would contravene one of the major design goals of
>XML - that any compliant XML processor should be able to read any compliant
>XML document.

This is in contradistinction to other markup specifications. Is XML trying
to specify the world around it? Are we specifying the behavior of an XML
processor (parser?), or an entire XML environment? Note that while HTML
mentions things like URLs and HTTP, the implementation details of those
protocols are left to their own specifications.

>On the other hand, there is also substantial concern about giving an
>unconditional blessing to any particular name resolution mechanism at
>this point in history.

Given that one is well accepted but changing and the other is not ready for
prime time just yet, I don't see why there is a need for an "unconditional
blessing". This sounds like a red herring. Merely describing a solution as
*one option* for resolution, without restricting others, seems like the path
of least resistance.

>Thus, there are a variety of options open to us.
>1. Leave it as it is

This disallows a feature that *seems* to be demanded by many potential
users of XML, forcing them to continue using 8879 unless they are willing
to give up use of public text or embed system info in their documents. For
example, I imagine an XML-ized HTML DTD available for XML developers might
be a good idea. Well, how would that be distributed? A fixed SYSTEM URL?
Yuck. That is not much of a standard.

>2. Agree that we'll put PUBLIC identifiers into XML *when* we are ready
>   to specify the resolution mechanism; the practical effect is almost
>   certainly that they don't go in for now.

Again, this raises the question of why the XML spec *needs* to create a
complete solution in the first place. Putting off the discussion for later
versions seems dangerous. I can see FONT and BLINK on the horizon.

>3. Put a slot in the syntax for PUBLIC identifiers...
>   3* ...and require an accompanying SYSTEM identifier

*Requiring* a SYSTEM identifier is tantamount to requiring authors to
hardwire URLs into their documents, which is (at least in my mind) the
principal objection to the status quo.

>   3a - say nothing about resolution mechanisms, perhaps providing a
>        taxonomy of available technologies in this area

This would be my choice.

>   3b - document one resolution mechanism, but not make it required.
>        This is somewhat similar to our i18n approach, where we bless 2
>        encodings but admit the possibility that people use others.

This would be fine, but I'm guessing some folks don't want anything to do
with putting Socats in the spec, judging by the discussion. I think if we
did, it would be best as an appendix.

>   3c - document one resolution mechanism as before, but make its use
>        mandatory for XML processors.

This limits the XML specification and shackles the development of better
approaches in the future. While it may promote interoperability at
some level, I suspect there are many who would either break the rule or
simply ignore XML.

> Note that there is a continuum between 3b and 3c; we could place
> varying strengths of recommendation behind one resolution mechanism,
> with homilies about document portability.


>In the area of which resolution mechanism to (perhaps nonexclusively)
>bless, SGML/Open catalogs (hereinafter Socats) stand out, and would
>probably be the ERB's choice.

My choice as well, particularly if we limit Socats to PUBLIC only,
disallowing such things as LINKTYPE, SGMLDECL, DTDDECL and the complexities
of delegated catalogs. This would reduce a catalog to a simple lookup table,
which should be extremely simple to parse.
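
To illustrate how small a PUBLIC-only lookup could be, here is a minimal
Python sketch. It is not the SGML/Open catalog specification: it assumes
each entry has the form PUBLIC "public-id" "system-id", that whitespace in
a public identifier can be collapsed before comparison, and that the first
entry for an identifier wins. The identifiers and URLs in SAMPLE are
hypothetical, as are the function names.

```python
import re

# One entry: PUBLIC "public-id" "system-id" (single or double quotes).
# Assumption of this sketch: everything else in the catalog is ignored.
ENTRY = re.compile(r"""\bPUBLIC\s+(["'])(.*?)\1\s+(["'])(.*?)\3""", re.DOTALL)


def normalize(public_id):
    # Collapse runs of whitespace so line-wrapped identifiers still match.
    return " ".join(public_id.split())


def parse_catalog(text):
    """Return a dict mapping public identifiers to system identifiers."""
    table = {}
    for match in ENTRY.finditer(text):
        # setdefault keeps the first entry seen for a given identifier
        # (first-match-wins is an assumption of this sketch).
        table.setdefault(normalize(match.group(2)), match.group(4))
    return table


def resolve(table, public_id):
    """Look up a public identifier; None means the catalog has no entry."""
    return table.get(normalize(public_id))


# Hypothetical catalog contents, for illustration only.
SAMPLE = '''
PUBLIC "-//Example//DTD Hypothetical HTML for XML//EN" "http://example.org/dtd/html-xml.dtd"
PUBLIC "-//Example//ENTITIES Hypothetical Symbols//EN" "symbols.ent"
'''

table = parse_catalog(SAMPLE)
print(resolve(table, "-//Example//DTD Hypothetical HTML for XML//EN"))
```

A processor that finds no entry (resolve returns None) would still be free
to fall back on some other mechanism, which is the point of keeping the
recommendation nonexclusive.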

>On another hand, there has been a lot
>of work go into the URN effort; on another hand, that work has not yet
>borne practical fruit in terms of ubiquitous implementations; on another
>hand, the FPI syntax is repellent to some and it is not clear how well
>it supports internationalization; on another hand, it may be the case
>that FPI's really are URN's as they stand.

These are issues that have yet to be decided. We can't wait until they are,
given that the timeline for this spec is so short.

>I suspect that if a binding vote were taken today, the ERB would either
>(a) reinstate the PUBLIC keyword, and put in a nonexclusive recommendation
>    for Socat support, or

My vote. (Well, I don't get a vote, but whatever.)

>(b) refuse to put PUBLIC in until there was agreement on a required
>    resolution mechanism.

As I mentioned above, I think this puts an unnecessary onus on the XML
specification to provide a solution, which seems unlike other markup
specifications.


    Murray Altheim, Program Manager
    Spyglass, Inc., Cambridge, Massachusetts
    email: <mailto:murray@spyglass.com>
    http:  <http://www.cm.spyglass.com/murray/murray.html>
           "Give a monkey the tools and he'll eventually build a typewriter."