- From: <lee@sq.com>
- Date: Tue, 1 Apr 97 19:13:27 EST
- To: w3c-sgml-wg@w3.org
Bill Smith's comments are very much to the point, I think. He wrote:

> The scheme specific part of a URL can be a name. To my knowledge nothing
> precludes defining the scheme specific part of a URL as a name. An equally
> intelligent cache can be constructed using fpi:<fpi specific part>.

and that's correct. Of course, one can also use urn: there too.

There was provision early on in HTML for <A> to take a URN attribute as well as a URL; if that's still in the DTD, it's really there for backwards compatibility, I would guess. People quickly realised that you can start a URN with urn: and have it work. You can even do that with existing browsers, if they are configured to use a proxy server -- I just tried with Netscape, for example, which knows to pass urn: URLs on to a proxy server (which in our case then doesn't know how to handle them yet, of course).

> > And of course it's not at all unhelpful to be able to give a name, and a
> > default location in case the client cannot process the name.

I'd rather allow a list of URLs in a SYSTEM identifier to accomplish this. Note that space is not allowed in URLs. However, if you find yourself giving multiple URLs to access the same thing, and expecting software to try them one at a time until one "works", you're asking for trouble.

For ftp: URLs, a message saying "too many people logged in" is often indistinguishable from success, for example... and many HTTP servers now return a default document if you ask for a URL that isn't there. At least some of them also give the HTTP 404 error (NOT FOUND) in that case. But at least some servers also have a built-in time delay for that case, so as to avoid the case where errant robots bring a server to its knees.
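As a rough illustration of the trouble described above, here is a minimal sketch of the "try each URL until one works" loop. Everything in it is hypothetical: the host names, the `fake_fetch` stand-in for real network access, and the canned responses. It assumes a server that answers a request for a missing document with a default page and a success status, which is the behaviour described above.

```python
def first_working(urls, fetch):
    """Naive fallback: return the first URL whose fetch looks successful."""
    for url in urls:
        status, body = fetch(url)
        if status == 200:  # the only success test this loop has
            return url, body
    return None, None


# Hypothetical canned responses, hard-coded so the sketch runs offline.
RESPONSES = {
    # This server no longer has the document and sends its default page,
    # but still reports success.
    "http://a.example/doc.xml": (200, "<html>Welcome to a.example</html>"),
    # This server actually has the document.
    "http://b.example/doc.xml": (200, "<?xml version='1.0'?><doc/>"),
}


def fake_fetch(url):
    """Stand-in for a real fetch; unknown URLs get a 404."""
    return RESPONSES.get(url, (404, ""))


candidates = ["http://a.example/doc.xml", "http://b.example/doc.xml"]
url, body = first_working(candidates, fake_fetch)
print(url)
```

The loop stops at the first candidate, because nothing in the status code distinguishes the default page from the real document, and the second URL is never tried.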
Netscape doesn't tell helper apps when a download has failed. For Panorama, we have a timeout: if we've asked more than 40 times for a URL and still not got it, or it's taken more than 6 minutes, chances are that Netscape is sitting there displaying an error message, or has silently failed.

So you <blink>_*_*_*_*_really_*_*_*_*_</blink> want to avoid this.

Finally, it's not necessary to have "if this doesn't work, try _this_" in situations where the publisher _does_ have control. If you put an XML file up on the web, you know its URL. This is not to say that indirection should not be supported, but that if you provide a fallback URL, it will be the one you could have provided in the first place, and fetching the document would have been faster (no need to connect for CATALOG).

If all XML clients use CATALOG in the same way, the chances are high that if you publish a document, you'll put a CATALOG file there that everyone else's application can read. In that case, no fallback URL is needed.

If some XML clients use "CATALOG", some use "catalog", some use "Catalog", and some use whois++ and/or URN resolution instead, you'll need to put several different kinds of catalog file on your server along with your document, and also make a whois++ entry, and perhaps do other things as well, as yet undreamt of, if you want to reach a wide audience.

Someone (I forget who) said that at first, only SGML implementors will be using XML. I hope that's false. If it isn't, we didn't need to change SGML: as Peter and James and others who have written SGML parsers will I am sure agree, the goal is to make something that a CS grad with little or no SGML background but some basic web and HTML knowledge will look at and want to use and be able to implement quickly.

So it's NOT okay to allow PUBLIC without specifying _exactly_ what other files to fetch to find out how to look up the PUBLIC identifier to find out what URL to try to fetch. If new methods come along, the XML spec can be revised.
That is why there is provision for a version ID in the header.

Bill wrote:

> We only need involve the IESG if we expect FPIs to interoperate. This is a
> red herring since (as best I can tell) the list has reached consensus on
> resolution interoperability - there won't be any.

I don't think we've reached consensus. If we do, I hope it will be to leave out PUBLIC, but if not I think we have to have CATALOG. In that case, I hope we don't have to admit in public that CATALOG is an ASCII non-XML non-SGML file because the SGML vendors said it would be too hard to implement if it was in SGML. Ooops.

But suppose that we go with a CATALOG in XML. Paul wrote:

>> One problem is that it will be _only_ one mechanism once it's defined.
>> That violates one of the desired properties of FPIs. (resolution-mechanism
>> independence)

In fact, this is not the case. The actual resolution is done at the URL level. So if you discover a URN scheme, you can have CATALOG and put in it entries like

    PUBLIC "-//something//JP" "urn:-//something//JP"

if you like. If you're willing to do the HTTP fetch of the catalog, another layer of indirection is probably OK until XML is revised.

Bill said:

> One resolution mechanism that works for (one or more) location-independent
> namespaces would be preferable to many abstract resolution mechanisms. For
> 10 years we've had PUBLIC and yet we still don't have a single mechanism
> that interoperates. I'd vote for one concrete mechanism as opposed to an
> infinity of abstract ones. XML should be concrete, SGML is abstract.

Yes, I agree.

> The rest of us will be left with "404 Not Found" because (I suspect)
> fallback XML objects will be maintained about as well as HTML pages.

Or worse, because if the PUBLIC ID works, you don't need to test the fallback. And there's always the malicious user, as per Ken Holman's comment about using "", but I'm not sure it's necessary to worry about that.
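To make the CATALOG indirection discussed above concrete, here is a minimal sketch of a lookup table built from entries of the form PUBLIC "public-id" "url-or-urn". The one-entry-per-line syntax, the function names, and the second sample entry are assumptions made for illustration; nothing here is a settled catalog format.

```python
import re

# Assumed entry syntax, modelled on the example in this message:
#   PUBLIC "public-id" "url-or-urn"
ENTRY = re.compile(r'^\s*PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*$')


def parse_catalog(text):
    """Return a dict mapping public identifiers to system identifiers."""
    table = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            # First entry wins; later duplicates are ignored.
            table.setdefault(match.group(1), match.group(2))
    return table


def resolve(public_id, table):
    """Look up a public identifier; None means the catalog has no entry."""
    return table.get(public_id)


# Hypothetical catalog contents; the second entry is invented for the demo.
catalog_text = '''
PUBLIC "-//something//JP" "urn:-//something//JP"
PUBLIC "-//example//DTD Memo//EN" "http://www.example.com/dtds/memo.dtd"
'''

table = parse_catalog(catalog_text)
print(resolve("-//something//JP", table))
```

The point of the sketch is that the catalog only maps a name to something fetchable. Whether that something is an http: URL or a urn: is left to whatever resolves URLs, which is why one concrete catalog format need not rule out other resolution schemes.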
> I'm not convinced PUBLIC is required and suspect that it will be (for
> the most part) ignored. If we persist in adding features that will
> be ignored, we can predict the fate of XML.

Agreed. Very much so.

I think maybe some people are underestimating the effect of the first dozen or so "file not found" messages from an XML application. Yes, you can ignore the DTD itself. What about declared entities? What about image files? You don't see inline images in this document because you and I used different browsers? Come on, folks!

The Web continues to grow because of compatibility. Use any browser. On any platform. Yes, there are mistakes -- like ActiveX :-) -- but basic HTML interoperability is there now today. Let's not do worse.

Lee
Received on Tuesday, 1 April 1997 19:13:23 UTC