Re: XML catalog draft

I find myself agreeing completely with James.

On Paul's notes:

> 1.  It was a goal to allow the catalog to map public identifiers into
>     URNs.  (Do we want to open up the defn of sysids in XML to include
>     URNs, or does the defn of URL already include URNs?)

The solution, it seems to me, is to use URI everywhere instead
of URL or URN, and to allow any system identifier (whether in
a CATALOG or an instance) to be a URI.
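
To illustrate (a sketch only, in Python; both identifiers below are
made-up examples): a generic URI parser handles a URL and a URN alike,
so a system identifier defined as a URI would cover both.

```python
from urllib.parse import urlparse

# Hypothetical system identifiers: one URL and one URN.
sysids = [
    "http://www.example.com/dtds/memo.dtd",
    "urn:isbn:0451450523",
]

# Both parse under the generic URI syntax; only the scheme differs.
schemes = [urlparse(s).scheme for s in sysids]
```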

> 2.  It was a goal to allow XML catalogs to be used with SGML/XML
>     repositories (aka databases).  For example, it was felt important
>     to allow an entry such as:
>    PUBLIC "corporate legal boilerplate C312" "<corpdb>get_bplate('C312')"
>     where the rhs is some database call (whether expressed in FSI syntax
>     as in this example or not).

This I believe is bogus.  You could just as well use
    PUBLIC "[...] C312" "get_bplate;id=C312"
and be legal -- this is a partial URL that does the same thing.
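
As a rough sketch of what I mean (Python; the base URL is my own
assumption and appears nowhere in the proposal), the partial URL is
resolved against wherever the catalog lives, and the server at that
address is free to make the database call:

```python
from urllib.parse import urljoin

# Assumed base URL of the catalog; hypothetical, for illustration only.
CATALOG_BASE = "http://corpdb.example.com/legal/"

# The right-hand side of the PUBLIC entry, written as a partial URL.
entry_rhs = "get_bplate;id=C312"

# Ordinary relative-URL resolution; no FSI machinery is involved.
resolved = urljoin(CATALOG_BASE, entry_rhs)
```

The client only ever sees a URL; what sits behind it is the server's
business.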

> 3.  It was a goal (albeit a less important one) to allow catalogs to 
>     have FSIs on the rhs. 

We have already agreed not to require XML systems to support FSIs.
They are too complex.

> 4.  Though URLs remain the most obvious thing for a sysid to be, it
>     might make sense to allow the market to decide what sysids will
>     work.  This is one place where it might be best to suggest that
>     URLs give maximum interoperability, but to allow other things
>     to develop.  After all, any decent XML tool that finds a sysid
>     that isn't a URL is going to try its best to figure it out anyway,
>     so how is restricting it in the XML spec going to help?

I don't see the logic here at all.  You don't maximise interoperability
by allowing non-standardised strings.  The market is not capable of
standardising -- if it were, we wouldn't be here, and WG8 could disband.

> [...] I'm quite sure that no one in our subgroup
> would feel it inappropriate for the ERB or WG to decide to restrict the
> rhs of catalog entries to URLs, especially if the concerns I list above
> are considered and either addressed or deemed beyond the scope of XML.

In this light, I hope the ERB does this...

I should add that with the reservations I've stated, and also with
James' reservation about requiring catalog support at all, I think
this is a sensible and helpful proposal.

I want to make sure that it does not forbid a system that uses a remote
lookup in a database instead of a catalog (for example).

We need to be 200% clear on what happens if both a system and a public
identifier are given:
    * is the catalog consulted?
    * what happens if the catalog lookup gives a different result than
      the system identifier?  For Panorama, the system identifier on
      the DOCTYPE line usually wins in that case.  But it actually
      depends on the user's preferences.  Author/Editor ignores
      the PUBLIC identifier if there is a SYSTEM one.
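
To make the choices concrete, here is a minimal sketch in Python of one
possible rule. The function name, the "prefer" switch and the sample
identifiers are all hypothetical; this is not what the proposal says,
nor a description of how Panorama or Author/Editor is implemented.

```python
from typing import Mapping, Optional


def resolve_entity(
    public_id: Optional[str],
    system_id: Optional[str],
    catalog: Mapping[str, str],
    prefer: str = "system",
) -> Optional[str]:
    """Pick a storage location, or return None if nothing resolves.

    prefer="system": a given system identifier always wins.
    prefer="public": a catalog match on the public identifier wins.
    """
    catalog_hit = catalog.get(public_id) if public_id is not None else None
    if prefer == "public" and catalog_hit is not None:
        return catalog_hit
    if system_id is not None:
        return system_id
    # No system identifier: fall back to the catalog, if it had a match.
    return catalog_hit


# Hypothetical catalog with a single PUBLIC entry.
catalog = {"-//Example//DTD Memo//EN": "http://example.org/dtd/memo.dtd"}

system_wins = resolve_entity("-//Example//DTD Memo//EN", "memo.dtd", catalog)
public_wins = resolve_entity(
    "-//Example//DTD Memo//EN", "memo.dtd", catalog, prefer="public"
)
```

Whatever rule is chosen, the spec has to state it, including what is
returned when neither identifier resolves.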

We also need to say what happens when the catalog is not found.

We also need to say exactly where to find the catalog file.
Does a processing instruction in the document header point to it,
for example?  What if the XML document is generated as the result of
a CGI query, so that there is no "current directory" of the document...?
What if we don't have a "base URL" for the XML document?

I know the proposal doesn't answer these questions.  It needs to.