April Fools RANT about Catalogs

> I think maybe some people are underestimating the effect of the first
> dozen or so "file not found" messages from an XML application.
> Yes, you can ignore the DTD itself.  What about declared entities?
> What about image files?  You don't see inline images in this document
> because you and I used different browsers?  Come on, folks!  The Web
> continues to grow because of compatibility.  Use any browser.  On any
> platform.  Yes, there are mistakes -- like ActiveX :-) -- but basic
> HTML interoperability is there now today.
> Let's not do worse.

In my opinion, you seriously underestimate the intelligence of web
authors and users. HTML is very happy to allow you to make URLs that
will only work on your machine (such as "file:") or that will only work
with particular software (such as "shttp:"). XML will be no more

So what are the real answers to:

Q: Are URLs interoperable for Web delivery?

   Yes, they are.

Q: Why are URLs interoperable?

   Because there is an agreed-upon mechanism by which documents
   identified with URLs can be delivered over the Web.

No. URLs are interoperable if you use them correctly. They are
non-interoperable if you do not use them correctly. URLs are also only
interoperable on that subset of Web software that supports the
particular URL protocol you have chosen to use. PUBLIC identifiers would
be similarly interoperable on the subset of Web software that supports
the resolution mechanism that your public name is published on.

The only way to stop people from making mistakes is *education*, not
*constriction*. Progress requires us to make choices that will allow
stupid people to make non-interoperable documents. The URL standard
explicitly allowed people to create new URL protocols that would not be
supported by existing software because that offered *power* (within a
domain smaller than the "whole Web") and an *upgrade path*, two things
that we are now being asked to forgo in PUBLIC, which also offers
*power*, within domains smaller than the whole web (which are willing to
organize their own resolvers in the same way that URL extenders organize
their own protocols) and an *upgrade path* to a mechanism that will be
fully standardized, global and automatic.

As I've pointed out before, generic markup itself is the largest source
of incompatibility I can think of: If you've ever got an arbitrary SGML
document mailed to you, you know exactly what I mean. Wouldn't it have
been easier if that document had been constrained to HTML? If people
want totally foolproof, automatically interoperable systems, we should
be working on an HTML subset, not XML.

Generic markup is also the source of SGML's power -- the power to define
your own, perhaps non-interoperable documents. XML will not change this.
I will not be able to download one of Peter M-R's chemical models and
spin around molecule models in an arbitrary browser (unless he delivers
his code as a Java applet). XML gives him the power to define something
that is mostly non-interoperable with my browser, because that is what
he needs to do to get his job done. When I define a 3D scene in XML
using an internally developed language, it will be similarly
non-interoperable with "off the shelf" browsers. Is anybody going to
claim that this is an abuse of XML?

The claim that until PUBLIC was introduced XML documents were all
interoperable by definition is a complete myth. Even ignoring the cases
where valid XML documents could be improperly parsed by valid XML
parsers, without the semantics the parsing is usually useless. And we
are certainly not going to hammer out a semantics language that
encompasses Peter's chemical models, my 3D scenes and the needs of
online hypertext too.

Since PUBLIC is not really an obvious thing for a novice user to
"accidentally" use (in the way that they might accidentally use "file:"),
I think that the potential interoperability problems of PUBLIC are being
totally overstated, whether or not we agree on a resolution mechanism.
Broken links will happen, with or without PUBLIC. If you give me PUBLIC,
though, I can prepare my documents for the day when broken links will be
a bad memory. That, to me, is a massive step in the right direction. If
you give me catalogs, then (if browsers support them) I can actually make
use of my public identifiers on the web today, while I wait for
automatically maintained systems. That is even more exciting.
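
To make the catalog idea concrete, here is a minimal Python sketch of the
kind of lookup I have in mind. It assumes a catalog made of simple
PUBLIC "public-id" "system-id" lines, in the style of SGML Open catalogs;
the identifiers, the example.org URL and the function names are all
hypothetical, and a real resolver would need to handle far more than this.

```python
import re

# Matches one catalog entry of the assumed form:
#   PUBLIC "public identifier" "system identifier"
PUBLIC_ENTRY = re.compile(r'^\s*PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*$')

# Hypothetical catalog text; the identifiers and URL are illustrative only.
SAMPLE_CATALOG = '''
PUBLIC "-//Example//DTD Memo//EN" "http://example.org/dtds/memo.dtd"
PUBLIC "-//Example//ENTITIES Logos//EN" "logos.ent"
'''

def parse_catalog(text):
    """Collect PUBLIC entries into a dict; any other line is ignored."""
    mapping = {}
    for line in text.splitlines():
        match = PUBLIC_ENTRY.match(line)
        if match:
            mapping[match.group(1)] = match.group(2)
    return mapping

def resolve(public_id, system_id, catalog):
    """Prefer the catalog's entry; otherwise fall back to the system id."""
    return catalog.get(public_id, system_id)

catalog = parse_catalog(SAMPLE_CATALOG)
```

With this, resolve("-//Example//DTD Memo//EN", "memo.dtd", catalog) gives
the catalog's location for the DTD, while a public identifier that is not
listed simply falls back to the system identifier supplied with it, so a
missing catalog entry is no worse than having no PUBLIC at all.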

I would go so far as to say that the very existence of the catalog
mechanism, in wide use, might twig researchers to some innovative,
automatic solutions to the broken link problem that involve collecting
and organizing catalogs. It would be tragic if we let this tremendous
opportunity to show some leadership slip through our fingers.

 Paul Prescod

Received on Tuesday, 1 April 1997 23:11:53 UTC