Re: draft proposal for catalog resolution [market distinction]

At 01:56 31/03/97 -0500, Lee wrote:
>I can't imagine many people choosing which XML product to buy based
>chiefly on the way it resolves PUBLIC identifiers.

Not chiefly, but that fact that one handles them and another doesn't
could be significant (or that one handles them in a sensible manner
and another does something silly).

>No, Paul and others, the market won't decide PUBLIC for us.

I wouldn't want to bet on anything these days. Lots of people never
thought IBM's stock would slip :-)

>What they will do is frustrate users.

Right. Like they do at the moment,

>The reasoning has to be based on what will maximise interoperability.
>Our market is not going to prefer one mutually incompatible browser
>or editor or whatever over another -- it is going to prefer HTML or
>PDF, where these hassles go away.

But it's not...that's the whole point. The market will prefer whatever
looks cutest, just like they do at the moment. If two products are
pretty much neck-and-neck, but one keeps complaining "Cannot resolve
FPI" (no names, no pack drill) and the other tries sensibly each of
the options until it finds resolution, then I know which one I'd

>If we make the mistake of allowing PUBLIC, we have at least to _try_
>and ensure that every XML processor can handle every XML file on
>the web without human intervention.  

No we don't. That's the job of the implementors. OK, "we" might mean
you-working-for-SQ, sure. Our job is to provide the handles or hooks
to allow them to do so. If we allow PUBLIC, we have to document what
it's for, and how to resolve it. They can accept or reject this, and
decide to support it or not, regardless of what the "rules" of XML
say (just like now). Their problem.

>People hoping to put URNs in PUBLIC identifiers will have to check
>that it's OK not to have ! @ # % ^ & _ { } [ ] | \ ~ ` ; < > , in
>URNs, as they are forbidden in PUBLIC identifiers.  Perhaps SGML
>could be changed here, as there doesn't seem any advantage to
>restricting the character set, and it's going to look odd to allow
>Kanji or Devanagari or accented Latin characters in SYSTEM IDs
>such as file names and URLs (URL internationalisation is in progress,
>but file: URLs are already OK in practice at least, and you can
>escape characters in URLs with %, a character not allowed in a
>PUBLIC Id) and have A-Za-z 0-9 and a little punctuation in PUBLIC
>identifiers, that are supposed to be more powerful.

This is utterly specious: the character set used to express an FPI has
nothing to do with any supposed power of resolution. C does not
permit certain characters in variable names: does this make C any
less powerful than something that does?

    <!DOCTYPE xx [
	<!Entity catalog % SYSTEM "how to get catalog.xml">
	<!--* catalog.xml defines the yy entity *-->
It's all just too much, especially when the gains of using PUBLIC
>seem so small.  With [5] above you get almost all the benefits anyway,
>with a lower implementation cost, one fewer langauge, and heck, 

But where in your [5] do I state what doctype this is, what it's called,
and who is responsible for it, in a machine-processable manner that
is robust and unambiguous and not subject to mutability? Or is this
asking too much of XML?

I simply don't see the problem. Implementing PIDs is not rocket science:
I did it for my server in about 20 mins. OK, it doesn't handle all the
vagaries yet, and a well-aimed shot will probably flake it, but if I
had a day or two, and all the resources of a commercial developer, I
could make it reasonably bulletproof. Doing it using catalogs is even
easier, as it's just a lookup instead of an algorithm. 

>if we removed SYSTEM xxx from DOCTYPE, a smaller language!

Over my head. I don't see how you would do this.

>The missing benefit is that it's harder to do distributed resource
>mirroring without standardised names, as per Michael's TEI example.
>But we haven't solved that problem, ans if the URN group solves it,
>you can put URNs in your SYSTEM identifiers, and you _stil_ don't need

URNs are not a substitute for PIDs unless there's something special
about them that I've missed which lets them identify the same things
in the same kind of way.