Straw-man XML Support for PUBLIC Identifiers and XML Catalogs

--------- Section 4.2.2 "External Entities" in Working Draft 14-Nov-96
includes the following excerpts:

[69] ExternalId := 'SYSTEM' S SystemLiteral
[70] SystemLiteral := '"' [^"]* '"' | "'" [^']* "'"
The Literal that follows the keyword SYSTEM is called the entity's
system identifier.  It is a URL,...

--------- Straw man (please help, this doesn't look precise enough in

[69] ExternalId := 'PUBLIC' S ExternalLiteral ( S ExternalLiteral )? |
                   'SYSTEM' S ExternalLiteral
[70] ExternalLiteral := '"' [^"]* '"' | "'" [^']* "'"

The ExternalLiteral that follows the PUBLIC keyword is called an
entity's public identifier.  The ExternalLiteral that may follow the
entity's public identifier and the ExternalLiteral that must follow the
SYSTEM keyword are called an entity's system identifier.  

An XML processor may retrieve the content of the entity using the public
identifier as an index into an associated XML catalog (see the
definition of the catalog for details).  If the XML processor cannot
successfully use or resolve the public identifer, and a system
identifier follows the public identifier, the XML processor shall behave
as if the system identifier was the only identifier supplied.  If the
XML processor cannot successfully use or resolve the public identifier,
and a system identifier does not follow the public identifier, the XML
processor shall behave as if a system identifier could not be resolved.

The system identifier is a URL,...

4.3.1 XML Catalog Specification

When an XML processor encounters an external entity defined by a public
identifier, the processor may, but need not, determine the associated
system identifier required for the tasks described in section 4.3 of
this specification from a system-sourced supplemental XML catalog entity
defined as follows:

[CAT1] XMLCatalog := S? ( ( catComment | catPublic | catEntry ) S )* 
                     ( catComment | catPublic | catEntry )?
[CAT2] catComment := '--' [^-]* ('-' [^-]+)* '--'
[CAT3] catPublic  := 'PUBLIC' S catLiteral S catLiteral
[CAT4] catEntry   := catKeyword ( S catLiteral )*
[CAT5] catLiteral := '"' [^"]* '"' | "'" [^']* "'"
[CAT6] catKeyword := [^S"'] [^S]*

In [CAT3] where the catKeyword is 'PUBLIC', exactly two catLiteral
arguments are required: the first is the ExternalLiteral of an entity's
or notation's public identifier and the second is information used to
locate the entity or recognize the notation.  An XML processor may
choose to determine a system identifier from this second catLiteral
value.  Users of an XML processor that determines system identifiers
from the second catLiteral value of the catPublic production must ensure
that the syntax of that value is compatible with the processor in use


A few things about this straw man ... 

(1) - it does not recognize the embryonic efforts that have started in
SGML Open regarding NOTATION declarations, but I don't think that it can
... we still need to specify the use of ExternalId in XML[77]

(2) - "... information used to locate ... or recognize ..." is far too
vague, but I wanted to get in that this may just be human information
for the reader and not always system information ... it could also be
the full FSI specification ... it could just be "<url>" followed by the
URL (so that it could be more in the future) ... I've left the exact
wording of whan an XML processor can do with the second catLiteral to
someone who can wordsmith more quickly than me

(3) - the bit about "users of XML processors ... must ensure ..." is
there in order to reinforce the "client-side-resolution" and show people
that it the catalog is not the resonsibility of the author of the
information, but rather, the consumer of the information (though it
behooves the author to be helpful here)

(4) - it does recognize that the use of the PUBLIC identifier is
optional by the author; that the author could also supply a SYSTEM
identifier in the event that the consumer cannot resolve PUBLIC
identifiers; and that an XML processor is not required to support PUBLIC

(5) - we could define the productions as follows:

[69] ExternalId := 'PUBLIC' S PublicId ( S SystemId )? |
                   'SYSTEM' S SystemId
[70] ExternalLiteral := '"' [^"]* '"' | "'" [^']* "'"
[70a]PublicId   := ExternalLiteral
[70b]SystemId   := ExternalLiteral

Please feel free to discard this all and start from scratch ... I just
wanted to get something written down that will show that this need not
be onerous.

...... Ken

G. Ken Holman            Tel:  +1 613 596-CADE(2233)    /\ /\
Chief Technology Officer Fax:  +1 613 596-5934          \/ \/   Computer
Microstar Software Ltd.  WATS:  1 800 267-9975        /\     /\ Aided
3775 Richmond Road       Mail: gkholman@microstar.com \/     \/ Document
Nepean Ontario           Info: cade@microstar.com       /\ /\  
CANADA K2H-5B7           Web:  http://www.microstar.com \/ \/
1605 Mardick Court, Box 266    Phone:  +1 613 489-2987
Kars, Ontario CANADA K0A-2E0   E-mail: gkholman@CanadaMail.com
Version: 2.6