- From: David Durand <dgd@cs.bu.edu>
- Date: Mon, 31 Mar 1997 16:38:43 -0500
- To: w3c-sgml-wg@w3.org
The issue of naming versus resolution is still coming up, in misunderstood form. For example, Lee's 6 equivalent mechanisms did not in fact offer equivalent "naming" behavior, though they do offer equivalent _resolution_ behavior. I will try to point out an effective operational property of names that no resolution mechanism can provide. I also point out that CATALOGS are the best solution to the stylesheet problem. In fact a better solution to that, than to the name resolution problem! The most important fact about PUBLIC IDs is that they define equivalence classes of documents. In order for these classes to make sense, we need to ensure global uniqueness of names. While this seems hard, in practice using hierarchical namespace delegation, and simply encouraging long identifiers gets us 99% of the way there. Even unregistered FPIs have turned out the be pretty robustly unique in practice. This is a social contract, but it's in fact less-onerous than the social contract implied by a URL -- which is that the owner will maintain an HTTP server, at a specified DNS location, with a specified resource on that server. People fail this contract all the time, of course, because to meet it requires money. Any time I see a PUBLIC ID that I've _ever_ seen before, it means that I am justfied in re-using the results of my last resolution. So, PUBLIC IDs are like a super-distributed cache. A browser using FPIs (even without ever resolving one) can cache the first copy of the TEI DTD it ever sees, based on the SYSTEM ID provided along with thar PUBLIC ID, and then _never fetch it again_. Further, even thought FPIs (and all names) may become broken, a broken FPI is actually more informative than a broken URL, as it generally contains a lot of human-readable information that may help someone who absolutely _must_ resolve it. And anyway, it's certainly no _more_ broken than a broken URL, and somehow people seem to be able to live with a lot of those! The forgoing is just to note that the objection that an unbreakable URN/FPI service is required for useful naming is just as bogus as the claim that a foolproof location mechanism is required for URLs to be useful. In fact, the biggest insight that TBL had (in my opinion) is that the academic hypertext community's insistence on consistent linkbases (unbreakable links) was a demand for an unnecessary perfection. We all dismissed the WWW as (possibly) interesting but obviously broken... and failed to see that it was damned useful, even if it was "broken". FPIs are also damned useful, even if we don't have _one way_ to resolve them. Now, as to CATALOGs per se, I'm agnostic. They are certainly a reasonable mechanism for resolution, and with delegate, they are a plausible draft of a distributed resolution mechanism. So I don't object to them. I actaully rather favor them for another reason: we are heading down the road to including PIs to locate style sheets, and I don't like that for a number of reasons: 1. I _don't_ want to be forced to put processing information in my documents. 2. I think it's a bad idea to _ever_ put processing information in documents. 3. Practically, even with HTTP 1.1, it's not an effective implementation strategy. 1, and 2 are very controversial, and I'll state them without further discussion. 3, however, has been bothering me for a while. HTTP 1.1 fixes the multiple connection problems of HTTP 1.0 and allows several commands to be bouncing down the wire at a time (which is very good). _but_ it is not a multiplexed connection -- one whole resource must be transmitted before the next one can be fetched. This means that once I start fetching the document, I can start fetching the stylesheet, and the catalog, but I can't use either of them until the document is all there. What we really want (if we are to use HTTP 1.1 effectively) is a way to fetch a "manifest" for the document, and then select and fetch the document, (one or more) stylesheets, and so forth, in whateever order makes most sense. In practice, probably style-sheet (if we don't already have it), DTD (if we need it, and don't already have it), document, then external entities. Now a CATALOG looks like a great nominee for the manifest. So much so, that I'm very tempted to say that we _should_ require catalogs, not for the PUBLIC->SYSTEM mapping, though that it a useful side effect, but so that we can require that the standard URL for a "document" may identifiy a CATALOG with a DOCUMENT entry, identifying where the XML parsing should begin. So I vote for adding CATALOG as a requirement (if we use it as a manifest), and even without that (if it's the only way we can get PUBLIC) I don't mind if we suggest/require it as a default/fallback resolution mechanism (though I am not convinced that we need one). In any case, I think that a clear property that PUBLIC has that has not been stated clearly before is that _any_ particular resolution mechanism is allowed, if the application knows that a URL is a legitimate copy of a PUBLIC ID'ed item. In particular, implementors and authors should understand that a PUBLIC is an explicit license to cache a resource and access it any other time that its PUBLIC ID is found. We need to state this quite explicitly, so that it is clear to authors that _if_ they use PUBLIC, then they _must_ use a different PUBLIC ID for each non-equivalent version of an entity -- and if they don't their documents may _not_ work, or will not work _consistently_. This is the same kinf of commitment you make when you advertise or use a URL -- you have made a commitment to keep that URL valid, if you want your documents to work. -- David _________________________________________ David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com Boston University Computer Science \ Sr. Analyst http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams --------------------------------------------\ http://dynamicDiagrams.com/ MAPA: mapping for the WWW \__________________________
Received on Monday, 31 March 1997 16:43:12 UTC