RE: What to do given both SYSTEM and PUBLIC?
Forgive me if this is a repost. Doesn't look from my end as though it went
out the first time. -Dave
At 5:06 PM 2/11/97, Kurt Conrad wrote:
>I lean towards giving the public identifier precedence.
>The knee-jerk argument is "why bother" with public identifiers if the system
>IDs take precedence. If I can't rely on the catalogs that I create, why
>should I invest the effort to maintain them. This would be especially true
>in an organizational setting, where such catalogs are increasingly likely to
>be managed as shared resources to improve consistency across a collection of
>documents. If as a publisher, I don't want the 'standard' component, I should
>not code a reference to it.
The original intent with SGML, as Goldfarb has explained it more than once,
was that the system identifier is king; if it's not present, then the system
is to try and impute one with whatever information (like a public identifier)
it does have. If I understood correctly, the scenario he envisioned was that
one would rarely use both, and if both were present it was because for some
reason someone wanted to (presumably temporarily) override the standard
system mechanism for resolving the public identifier.
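To make that reading concrete, here is a minimal sketch of the "system identifier is king" rule as I understand it. The function name, the `impute` callable, and the sample identifiers are all hypothetical and used only for illustration; a real entity manager would do considerably more.

```python
from typing import Callable, Optional


def resolve_system_first(
    public_id: Optional[str],
    system_id: Optional[str],
    impute: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Original-scenario rule: an explicit system identifier always wins."""
    # An explicit system identifier is used exactly as given.
    if system_id is not None:
        return system_id
    # Otherwise the system imputes one from whatever it does have.
    if public_id is not None:
        return impute(public_id)
    return None


# Hypothetical lookup table standing in for the system's own mechanism.
known = {"-//EXAMPLE//DTD Sample//EN": "/usr/local/sgml/sample.dtd"}

# Both present: the system identifier overrides the standard mechanism.
both = resolve_system_first(
    "-//EXAMPLE//DTD Sample//EN", "C:\\DTDS\\SAMPLE.DTD", known.get)
# Public identifier only: the system imputes a location from it.
public_only = resolve_system_first(
    "-//EXAMPLE//DTD Sample//EN", None, known.get)
```

Note that with both identifiers present the local table is never consulted, which is exactly why a leftover system identifier from the sending site causes trouble on the receiving one.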
SGML's primary original purpose was to be a common interchange format, not
only between systems sharing the same storage resources but between disparate
systems with wildly different storage resources and organization formats
thereof. And a system identifier on one system is not typically of much
use on another. Under Goldfarb's scenario, if you're sending a document
"off-site", you have a moral obligation to strip out your local system
identifiers so as to avoid confusing the entity manager on the receiving
system. (But many don't, so the receiver has to.)
Point 1. "System identifier is prime" is not just a knee-jerk, lazy reaction.
Point 2. The intended XML situation is complicated by the fact that
"disparate" systems may be sharing storage systems via the web. From our
point of view, a system identifier that is not a URL *should* have been
stripped. (Will all who put XML on the web be so considerate?) But if
the system identifier *is* a URL, should we have the capability to
override it? I think so, and that's why I think the "original scenario"
as described above is not the way to go. I think one should be able to
override locally, and so, I favor:
o If both identifiers are present, try to resolve the public
identifier using the local system (catalog, whatever).
o If both are present but the public identifier has no resolution
on the local system, then try the system identifier.
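The two rules above can be sketched roughly as follows. This is only an illustration under stated assumptions: the catalog is modeled as a plain mapping from public identifier to local storage location, and the function name, catalog entry, and URLs are all made up for the example.

```python
from typing import Mapping, Optional


def resolve_public_first(
    public_id: Optional[str],
    system_id: Optional[str],
    catalog: Mapping[str, str],
) -> Optional[str]:
    """Return the location to fetch, or None if nothing resolves."""
    # Rule 1: a public identifier that the local catalog knows wins.
    if public_id is not None and public_id in catalog:
        return catalog[public_id]
    # Rule 2: no local resolution, so fall back to the system identifier.
    return system_id


# Hypothetical catalog entry and identifiers, for illustration only.
catalog = {"-//EXAMPLE//DTD Sample//EN": "/usr/local/sgml/sample.dtd"}

# Known public identifier: the local copy overrides the URL.
local = resolve_public_first(
    "-//EXAMPLE//DTD Sample//EN", "http://example.org/sample.dtd", catalog)
# Unknown public identifier: the URL in the system identifier is used.
remote = resolve_public_first(
    "-//EXAMPLE//DTD Unknown//EN", "http://example.org/unknown.dtd", catalog)
```

Under this ordering a site that maintains a catalog can override a URL locally, while a site with no catalog entry still gets the document author's system identifier.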