[Prev][Next][Index][Thread]

Re: XML catalog draft



> From: murray@spyglass.com (Murray Altheim)
> 
> dgd@cs.bu.edu (David Durand) writes:
> >
> >Since PUBLIC is likely to be a point of user-tailorability, it should be
> >looked at first -- implementations that don't implement PUBLIC resolution
> >will simply ignore the PUBLIC, thus causing it to "fail". I can't think of
> >a case where someone who _has_ working public resolution, would prefer to
> >use the system ID -- andif they did, it seems they could always ensure that
> >any given public ID (or all) would fail to resolve.
> 
> Actually, that's the opposite situation in my experience:
> 
>     <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"
>                           "http://www.cm.spyglass.com/dtd/html.dtd">
> 
> This announces to the world that the document conforms to HTML 2.0, but
> tells the processor that a local copy of 'html.dtd' will provide the
> resource without resorting to a PUBLIC catalog lookup. IOW, why bother
> resolving the reference if the document seems to know where to look. Then,
> if the SYSTEM fails, resort to the more generalized process of a catalog
> lookup using PUBLIC.


But suppose I have a local copy of the HTML 2.0 DTD (and who wouldn't),
so my local system default catalog has:

  PUBLIC "-//IETF//DTD HTML 2.0//EN" "/home/dtds/html.dtd"

If you prefer SYSTEM, then there is no way to find the local copy in
favor of the spyglass one (unless the latter fails).

Suppose I want to avoid the unnecessary http download.  Suppose it's
are very big file (not just the HTML 2.0 DTD).  Suppose a few years
from now people start charging for accessing files.  Suppose I retrieved
the instance, its sgml declaration, its DTD, and all the entities it
references in a MIME-SGML package so it makes no sense to redownload
the DTD and all the entities once again.  Suppose for whatever set of
good reasons known only to me, I really need to use a slightly modified
version of the DTD when my application reads HTML 2.0 document instances.

For all these reasons, it seems important to attempt the lookup that
allows end-user (or at least end-user-site) indirection before
requiring that the system id be attempted.  If (as Terry is about to
say), the creator/publisher of the material does not want to allow
indirection or replacement, then they would not have included a public
id but would have only put a system id in the declaration.

In other words, in answer to "why bother resolving the reference if the
document seems to know where to look" I'd answer "why bother to have a
public id that allows the all-important indirection if you going to
ignore it in most cases."

Public ids aren't just a fallback for broken URLs, they are a very
important way to manage indirection of resource name resolution.

paul