Re: draft proposal for catalog resolution [market distinction]

lee@sq.com wrote:
> I can't imagine many people choosing which XML product to buy based
> chiefly on the way it resolves PUBLIC identifiers.

If the only product differences that made people buy one or another were
the make-or-break ones, then the small features would never get
implemented. They do. You know that the small usability and usefulness
features get added by the hundreds, or thousands. My car has a little
button that allows me to lock all of the doors at once -- it didn't make
me buy that car, but the cumulative effect of all of the little features
could certainly put one car above another. Car companies don't spend
money on the small features for nothing.

> We don't often get asked how Panorama does this when people are choosing
> between Panorama Pro and DynaText, that's for sure.  People work out
> which is cheaper, or which will work in their own environment, or
> which can be customised more easily to interoperate with some particular
> document management system.

And how they handle public identifiers will have an impact on all of
those categories! It is one of a checklist of items that they will look
at and it will swing them one way or the other. It may even be
unconscious: when an SGML consultant is suggesting one product over
another, and is on the verge of choosing, the days they spent trying to
make Author/Editor SOCAT compatible will remind them of what a headache
SOCATs are.

Besides which, we are in a slightly different situation: A/E and AdeptE
have very similar public identifier lookup mechanisms: A/E uses a file
corresponding to a proprietary format and AdeptE uses a PID
corresponding to a flat file. But browsers may use wildly different
processes like DNS-lookup vs. Alta Vista vs. LDAP vs. PURL. In some
cases the mechanisms they use may be chosen based on other components of
the company's inter/intranet plan: Microsoft might use ODBC to an Access
database and Netscape LDAP to a Netscape directory server. Those are
fundamental differences that could make or break a product.

> No, that's not what incompatibilities will do.
> What they will do is frustrate users.

It sounds to me like frustrating users is a Bad Thing that will result
in Lost Sales in the long term: the market will decide.

SOCATs are fairly new, and not an ISO standard. People do not yet know
that they are a standard and that A/E should support them: they will
learn and you will get the calls (or perhaps just the lost sales).

> But by then they
> are for the most part blaming the complexities of SGML, not us --
> and they are looking forward to seeing it fixed with XML.

You don't want PUBLIC identifiers, right? So these customers shouldn't
be looking to XML to fix the problem because in your opinion they should
simply not have the option of using them. I go in circles when I try to
follow this logic:

PUBLIC identifiers are not useful without a resolution mechanism.
Therefore people will not use them (if we can presume people to be smart
enough to only use the features that they need). Why then do we need to
strip them out? They will just become an ignored spot in the grammar,
like a PID that you don't understand. On the other hand, what if people
do use them, knowing that they are not fully specified? They must be
doing so because they believe that they can get value from them even
without a specified resolution mechanism.

This is exactly the situation in SGML: I need public identifiers, and my
world would be a much more difficult place without them, *despite* the
fact that I must sometimes fight with non-SOCAT software to get a
catalog installed. If the hassle were too great I am intelligent enough
to just avoid them, like SUBDOC or other features which are (for me)
more hassle than they are worth because of poor editor support.

> But if CATALOG isn't required, an XML A/E would continue to use
> extid.map, I expect.  Why change when you're having so much fun??

Fine. Better than nothing. If it isn't, I have the option to ignore
public identifiers!

> Our market is not going to prefer one mutually incompatible browser
> or editor or whatever over another -- it is going to prefer HTML or
> PDF, where these hassles go away.

Huh? HTML and PDF make these hassles go away by simply not providing the
functionality that some customers desire.

> If we make the mistake of allowing PUBLIC, we have at least to _try_
> and ensure that every XML processor can handle every XML file on
> the web without human intervention.  That includes no intervention
> by system administration.

That's fine. Peter has twice proposed a mechanism that allows this, and
it was in that context that I made my original comments:

> an identifier must be 
> either          SYSTEM  "url"
> or              PUBLIC "fpi" "url"
> and leave it to the much-vaunted "market forces" to resolve the issue?

*** Note that this system is *at least as reliable* as a system with
only URLs *** and perhaps more, because there is extra information about
how to find the file if the user agent is smart enough to know how to
look it up.

> Paul's CATALOG proposal as redrafted & posted by Michael is a good step.

I agree and support this proposal. Peter's idea is a good fallback.
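
To make the fallback concrete, here is a rough sketch in Python of how a
user agent might combine the two: look the FPI up in a catalog if it has
one, and otherwise just use the URL. This is only an illustration under
my own assumptions, not anything taken from either proposal. The
function names, the example FPI, and the example URLs are all made up,
and the catalog parsing handles only simple one-line SOCAT-style PUBLIC
entries.

```python
import re

# Matches simple SOCAT-style entries of the form: PUBLIC "fpi" "system-id"
# (hypothetical helper; only this one entry type is handled)
_ENTRY = re.compile(r'PUBLIC\s+"([^"]*)"\s+"([^"]*)"')


def parse_catalog(text):
    """Return a dict mapping each formal public identifier to a system id."""
    return {fpi: sysid for fpi, sysid in _ENTRY.findall(text)}


def resolve(public_id, system_url, catalog):
    """Prefer a catalog mapping for the FPI; otherwise fall back to the URL.

    public_id is None for a plain SYSTEM "url" identifier.
    """
    if public_id is not None and public_id in catalog:
        return catalog[public_id]
    return system_url


# Made-up catalog with a single entry.
catalog = parse_catalog('PUBLIC "-//Example//DTD Memo//EN" "dtds/memo.dtd"')

# FPI known to the catalog: the local copy wins.
print(resolve("-//Example//DTD Memo//EN", "http://example.org/memo.dtd", catalog))

# FPI not in the catalog: the URL is used, exactly as with SYSTEM alone.
print(resolve("-//Example//DTD Unknown//EN", "http://example.org/other.dtd", catalog))

# Plain SYSTEM identifier: no FPI to look up at all.
print(resolve(None, "http://example.org/plain.dtd", catalog))
```

A processor with no catalog at all behaves as if the dict were empty, so
every identifier resolves to its URL and nothing is lost relative to a
URL-only system.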

> The missing benefit is that it's harder to do distributed resource
> mirroring without standardised names, as per Michael's TEI example.
> But we haven't solved that problem, and if the URN group solves it,
> you can put URNs in your SYSTEM identifiers, and you _still_ don't need

You can put URNs in your SYSTEM identifiers, but you must *change your
documents* when the syntax is published. One of the major benefits of
SGML (and presumably XML), is document longevity. You should not be
changing your documents to accommodate technology change. If you are, you
are Doing Something Wrong.

 Paul Prescod
