- From: Paul Grosso <paul@arbortext.com>
- Date: Mon, 31 Mar 97 11:42:35 CST
- To: w3c-sgml-wg@w3.org
> From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
>
> 3 Resolving PUBLIC Identifiers (New section)
>
> An XML processor can resolve a public identifier to a system identifier
> by looking up the public identifier in a supplemental catalog, which has
> the following structure:
>
> XMLCatalog ::= S? ( ( catComment | pubEntry | otherEntry )
>                ( S ( catComment | pubEntry | otherEntry) )* )?
> catComment ::= '--*' (Char* - (Char* '*--' Char*) '*--'
> pubEntry ::= 'PUBLIC' S PublicID S SystemLiteral
> otherEntry ::= catKeyword (S SystemLiteral)+
> catKeyword ::= (Char* - (S | SystemLiteral | 'PUBLIC'
>                | PublicID | catComment))

I am not capable of determining (at least, not in the time I can allot
to it) whether the above catKeyword production is correct. What is
necessary (and what my writeup attempts) is that the catalog parser can
reliably find the beginning of the next catalog entry, regardless of
which combination of the "otherEntries" this processor does and does
not recognize, and regardless of the number (possibly variable) of
arguments any unrecognized keyword may take.

> A catPublic entry maps a public identifier into a system identifier,
> which may be used to locate the entity itself. For example:
>
> PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN" "iso-lat1.gml"
> PUBLIC "-//ACME//DTD Report//EN" "http://www.acme.com/dtds/report.dtd"
>
> The catalog format is that defined by SGML Open Technical Resolution
> 9401:1995 (Amendment 1 to TR 9401), which defines several keywords in
> addition to PUBLIC. These are matched by the otherEntry rule, and may be
> ignored (or acted on) by XML processors.

Speaking of logical names versus specific locations, I would prefer
that the reference above be to TR9401 (no date/amendment level/version).
I hope to have TR9401:1997 out for vote by SGML Europe. It would add
CATALOG, DELEGATE, OVERRIDE, NOTATION, SYSTEM, and a few other things.
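
The requirement stated above, that a parser can always find the start
of the next catalog entry, can be sketched roughly as follows in Python.
This is only an illustration under one assumption: it follows the draft
grammar quoted above, in which every argument is a quoted literal, so
any bare word must open a new entry. The names parse_catalog and
public_map are invented for this sketch and come from neither the draft
nor TR9401; a format that permits unquoted arguments would need a
different rule.

```python
import re

# Token kinds under the draft grammar quoted above: a comment delimited by
# '--*' and '*--', a quoted literal, or a bare word (taken to be a keyword).
TOKEN = re.compile(
    r"""--\*(?P<comment>.*?)\*--      # catComment
      | "(?P<dq>[^"]*)"               # literal in double quotes
      | '(?P<sq>[^']*)'               # literal in single quotes
      | (?P<word>[^\s"']+)            # bare word, treated as a keyword
    """,
    re.VERBOSE | re.DOTALL,
)


def parse_catalog(text):
    """Return a list of (keyword, [arguments]) entries.

    Every bare word opens a new entry and every quoted literal after it is
    an argument, so an unrecognized keyword can be skipped without knowing
    how many arguments it takes.
    """
    entries = []
    for m in TOKEN.finditer(text):
        if m.group("comment") is not None:
            continue  # comments carry no entry data
        if m.group("word") is not None:
            entries.append((m.group("word"), []))
        elif entries:
            dq = m.group("dq")
            entries[-1][1].append(dq if dq is not None else m.group("sq"))
    return entries


def public_map(entries):
    """Keep well-formed PUBLIC entries as public id -> system id (first wins)."""
    mapping = {}
    for keyword, args in entries:
        if keyword == "PUBLIC" and len(args) == 2:
            mapping.setdefault(args[0], args[1])
    return mapping
```

A processor that recognizes only PUBLIC would call public_map on the
parsed entries and ignore everything else; one that also acts on other
keywords can inspect the same list.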
>
> If the public identifier in a catalog entry matches that given in an
> ExternalID, then the system identifier in the catalog entry is
> associated with the public identifier in question and may be used to
> retrieve it. Before matching takes place, both public identifiers must
> be normalized: leading and trailing white space is stripped, and
> embedded white space is replaced by single space (#x0020) characters.
> (Except that no entity references are recognized, this is the same
> normalization as is performed for attribute values of type CDATA.) The

I don't think this bit is necessary. Neither the & nor % character is
allowed in public identifiers, so why talk of entity references?

> catalog lookup may involve more than one catalog file; it ends when the
> first matching entry is found.

TR9401 very carefully talks of one or more "catalog entry files" making
up a logical catalog. It avoids the phrase "catalog file" since this
seems ambiguous. I say this for the sake of terminology compatibility
with TR9401--I don't plan to enter into a terminology debate, and I'm
willing to go with any wording for XML so long as it is unambiguous.

> At user option, the XML processor must look first for a catalog file on
> the local system; the location of this catalog file, and the method of
> identifying it, are outside the scope of this specification. If no
> matching entry is found in the local catalog, the XML processor must
> look next in the default catalog. Unless otherwise provided by
> information outside the scope of this specification (e.g. a special XML
> element defined by a particular DTD, or a processing instruction defined
> by a particular application specification), the default catalog is that
> found using the relative URL catalog. If no matching entry is found in
> either the local catalog (if any) or in the default catalog (if any),
> then the XML processor may treat the catalog lookup process as having
> failed.

First, to comment on the above, I have to define my terms: in my
vocabulary, there is only one (logical) catalog effectively composed of
an ordered "list" of catalog entry files. While you can use different
terminology if this is the only way to get catalogs into XML (though I
fear confusion if/when people read/know of TR9401), I cannot make
intelligent comments without using the terminology carefully.

Given my definition, there is no such thing as a default/local catalog;
there is only the concept of allowing the user (either at the individual
level or at the system-administrator-configurable level) to specify to
the processor an ordered list of catalog entry files. Said list might
well be "first this local file, then that default file, then whatever
else might have been specified via some PI in the document."

I'm concerned with your two-layer local/default approach. I think it is
too prescriptive. In particular, what I want is to look first in the
document-specific catalog entry file (e.g., the "catalog" URL, relative
to the document instance) before looking in the default catalog entry
file on my local system, then next look perhaps at a default catalog
entry file on the system on which I found the document. Since you are
not being explicit about how one might specify any of the catalog entry
files, why be explicit about their number, order, or location?

By way of trying to make a specific suggestion, here is my quick
attempt:

At user option, the XML processor must attempt to locate and process one
or more catalog entry files in the order specified by the user. These
files may reside on the local system, in a location relative to the
entity in which the public identifier was used, or elsewhere. The
method of identifying the ordered list of catalog entry files is outside
the scope of this specification.
If no list of catalog entry files has been given to the XML processor,
the default list shall consist of the single catalog entry file
specified by the relative URL "catalog". If no matching entry is found
using the list of catalog entry files as determined above, then the XML
processor may treat the catalog lookup process as having failed.

> If catalog lookup on a public identifier fails, [possibly add: or an
> attempt to retrieve the entity using the result of the catalog lookup
> fails, -Ed.] and a system identifier was supplied in the externalID,
> then an XML processor must behave as if the system identifier was the
> only identifier supplied.
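
The lookup procedure discussed in this message (normalize the public
identifier, walk an ordered list of catalog entry files, stop at the
first match, and fall back to the system identifier on failure) might
look roughly like the Python sketch below. It is an illustration only,
not text from the draft or from TR9401. The function names and the
load_entries callback are hypothetical; how entry files are located and
read, and how the relative URL "catalog" is resolved, are deliberately
left to the caller.

```python
import re

# White space as the draft describes it: runs are collapsed to one #x20.
_WS = re.compile(r"[ \t\r\n]+")

# Default list when the user supplies none: the relative URL "catalog".
DEFAULT_CATALOG_ENTRY_FILES = ["catalog"]


def normalize_public_id(public_id):
    """Strip leading/trailing white space and collapse embedded runs."""
    return _WS.sub(" ", public_id).strip(" ")


def resolve(public_id, system_id=None, entry_files=None, load_entries=None):
    """Return a system identifier for the entity, or None if nothing applies.

    entry_files  -- ordered list of catalog entry file locations
    load_entries -- hypothetical callable: location -> dict mapping public
                    ids to system ids, or None when the file is unavailable
    """
    wanted = normalize_public_id(public_id)
    for location in entry_files or DEFAULT_CATALOG_ENTRY_FILES:
        entries = load_entries(location) if load_entries else None
        if not entries:
            continue  # unreadable or empty entry file: try the next one
        for candidate, target in entries.items():
            if normalize_public_id(candidate) == wanted:
                return target  # lookup ends at the first matching entry
    # Lookup failed: behave as if only the system identifier was supplied.
    return system_id
```

Passing the list as a single ordered argument, rather than separate
"local" and "default" parameters, mirrors the suggestion above that the
number, order, and location of entry files stay under user control.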
Received on Monday, 31 March 1997 12:49:38 UTC