- From: James Clark <jjc@jclark.com>
- Date: Thu, 20 Mar 1997 15:02:54 +0700
- To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
> c Public first, then system (if the public id is not found in the
> catalog). One vote for this.
>
> d Implementations may choose which to try first, but if the first
> ID it tries fails, then the implementation should try the other
> one. I.e. implementations may *not* say "If both a PUBLIC and
> a SYSTEM identifier are given, the XXXXX one is processed and
> the YYYYY one is ignored." Strong support for this view.

There are two ways in which use of a public id can fail:

1. resolution can fail (e.g. because there's no entry in the catalog)

2. access to the entity can fail (e.g. because the specified file does
   not exist) even though resolution succeeded

I think it's a reasonable design to say that if a public identifier
fails in the sense of (1) and there's a system identifier, then that
system identifier should be used.

I'm not in favour of saying that if a public identifier fails in the
sense of (2) and there's a system identifier then that system
identifier should be used, and I'm not in favour of saying that if a
system identifier fails, then any public identifier should be tried:

- If the user has put an incorrect system identifier in a catalog or a
  document, then that's an error, and a validating parser should tell
  the user about it. If the user, for example, mistypes a filename in
  a catalog, I don't think you are doing them any favours by trying to
  silently work around their error. Why do users need to put invalid
  system identifiers in documents or catalogs?

- In a general SGML context, a system identifier can consist of
  multiple storage objects. What does it mean for such a system
  identifier to succeed? Does it mean that access to the first storage
  object succeeded, access to all of them succeeded, or access to one
  of them succeeded? What does the implementation do if access to the
  first storage object succeeds, but access to the second storage
  object fails?

- Access to a single storage object can fail in multiple ways: for
  example, the object may not exist, the user may not have permission
  to access it, or some sort of I/O failure may occur. Should all
  modes of failure be treated alike? It might be reasonable to press
  on if the storage object doesn't exist, but do you really want to
  press on silently if there's some sort of permissions or I/O
  problem?

- Implementation, for SP at least, would be non-trivial. SP's approach
  is to use the external identifier to generate a system identifier
  when it encounters the declaration. The generated system identifier
  is then used to access the entity when needed. Documents may declare
  many entities that are never accessed (for example for use as link
  ends), so it's not desirable to access the entity until it's needed.
  The application that needs access to the entity may be very loosely
  coupled with the parser, so it's desirable to make it as simple as
  possible for the application to access the entity; this is achieved
  by having the application access the entity using only the generated
  system identifier. I would have to implement something like the FSI
  altsos option, whereby a system identifier
  "<osfile>foo.sgm</osfile>|<osfile>bar.sgm</osfile>" means try either
  foo.sgm or bar.sgm. Do we really want to require this sort of
  complexity of XML implementors?

James
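The policy argued for in this message (fall back to the system identifier only when catalog resolution fails, and report any later access failure as an error) might be sketched roughly as follows. This is an illustration only, not SP's code: the names ExternalId, generate_system_id, open_entity and EntityAccessError are hypothetical, the catalog is assumed to be a plain mapping from public identifiers to system identifiers, and a system identifier is assumed to be a single local filename.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


class EntityAccessError(Exception):
    """Access to an already-resolved entity failed (failure mode 2)."""


@dataclass
class ExternalId:
    # Hypothetical container for the identifiers given in a declaration.
    public_id: Optional[str] = None
    system_id: Optional[str] = None


def generate_system_id(ext: ExternalId, catalog: Mapping[str, str]) -> str:
    """Choose one system identifier when the declaration is encountered.

    The public identifier is tried first. Only a resolution failure
    (no catalog entry, failure mode 1) falls back to the declared
    system identifier. Nothing is opened at this point.
    """
    if ext.public_id is not None and ext.public_id in catalog:
        return catalog[ext.public_id]
    if ext.system_id is not None:
        return ext.system_id
    raise LookupError(f"cannot resolve external identifier: {ext!r}")


def open_entity(system_id: str) -> bytes:
    """Access the entity later, using only the generated system identifier.

    Any failure (missing file, permissions, I/O) is reported rather
    than silently worked around by trying another identifier.
    """
    try:
        with open(system_id, "rb") as f:
            return f.read()
    except OSError as exc:
        raise EntityAccessError(f"cannot access {system_id!r}: {exc}") from exc
```

Because the choice is made once, at declaration time, the application that later calls open_entity needs nothing but the generated string, which is the loose coupling described above.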
Received on Thursday, 20 March 1997 03:14:06 UTC