- From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
- Date: Mon, 17 Feb 97 09:23:31 CST
- To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
On Mon, 10 Feb 1997 20:45:01 -0500 Tim Bray said: >I think we should: > > - allow a PUBLIC keyword/string as appropriate I agree, *if* we also require, or at least recommend, a basic resolution mechanism. Developers with clever new ideas can always add their resolution code as extensions, and take the market by storm. Users should however be able to tell a system "I don't want your (*%&*$^&) extensions, just do the standard thing". Which requires having a standard thing the user can count on. Hence my preference for requiring a standard resolution method. People with existing working systems that use different resolution methods (special directory names, sequential search through the net, the I Ching, sortes Vergilianae, what have you) can continue to use them. If these systems do not support the standard resolution method, they won't interoperate with other XML systems, and so they shouldn't claim to be XML conformant. If they want to be XML conformant they should do some work to ensure interoperability. If we can't bring ourselves to require anything at all, preferring to make developers like us at the expense of users, who will curse the days we were born, at least we should have the nerve to make a recommendation.) > - make a decision as to what should be done when both PUBLIC and > SYSTEM are there PUBLIC before SYSTEM. (Answer to Lee: the system ID is there as a fallback for when I process the document with parsers that don't support socats, and in some cases don't support PUBLIC at all. It is not there to override the PUBLIC id.) (Answer to David: good point about complex resolution strategies. Provide them as a user-selectable option, if you like, but don't *REQUIRE* XML users to live through the same incompatibility hell that SGML users currently put up with. The simple strategy PUBLIC before SYSTEM works and judicious catalog handling can make it work precisely as I like, all the time, as long as the system looks for a local catalog before looking for the catalog that came with the document. Judicious catalog handling would be a lot easier if the socat were an SGML document instance, but that's apparently water under the bridge now.) > - provide a pointer to TR9401 in the write-up on the PUBLIC > identifier, saying that this is one proven-in-practice way to > resolve them I like Paul Prescod's idea (seconded by Jon and others) of publishing the subcommittee draft (with extensions) as a separate W3C WD on catalogs and catalog-based resolution. > - provide support to catalogs by providing a special PI, as > recommended by a couple of WG-ers, to help define the BASE and > thus find the catalog Right. But local catalogs need to come first, or I'll never get my ICADD-enhanced HTML, I'll always get the naked ICADD-deficient HTML on people's servers. > - investigate the problem of what seems like the unnecessary > restrictions on MINIMUM LITERAL; I don't think it's legitimate to > say that a PUBLIC identifier can't be a URN, which this would do. Right. Until and unless WG8 changes the character set for minimum literals, we can't change the set allowed in PUBLIC identifiers. I assume that '#' is forbidden in minimum literals because ISO 646 allows pound-sterling to occur at the same location (or did when 8879 was published; I assume it still does but haven't seen the current 646). Since in general I think the principle of allowing, in minimum literals, only what is really safe across systems, this seems like a good decision by WG8, and a really stooopid one by the designers of URLs. Now that we are stuck with trying to achieve compatibility with this stoopid choice, I think maybe 8879 should allow #. Implementors will know, or should know, to be able to handle pound-sterling as well. ---- Reading the thread on the catalog draft persuaded me we need several things I have not yet found on the net: - a clear tutorial on URL syntax, explaining the canonical positions on URL structure, meaning of '#', '?', ';', etc., preferably with references to the RFCs that actually define the stuff - a clear tutorial on the FPI syntax defined by ISO 9070, for those of use who understand the FPI syntax of 8879 but not 9070. If anyone can suggest URLs where such tutorials can be found, you'll have my undying gratitude. (URLs for clear treatments of MIME and MIME/SGML would also be gratefully accepted.) -C. M. Sperberg-McQueen
Received on Monday, 17 February 1997 11:05:00 UTC