- From: Arjun Ray <aray@q2.net>
- Date: Sat, 19 Feb 2000 00:46:29 -0500 (EST)
- To: W3C HTML <www-html@w3.org>
On Fri, 18 Feb 2000, Russell Steven Shawn O'Connor wrote:

> I'm drawing relationships between FPI's and URL to indicate the
> perform more or less the same job.

Sure.

> They are different, and the analogy are not perfect, but I believe
> they are adequate for the purposes of identifying a document by
> name.

Yes. The key point is identification by *name*.

> > Have you seen K.4.6 "Internet domain names in public identifiers" of
> > the WebSGML TC? http://www.ornl.gov/sgml/wg8/document/1955.htm
> > You could have something like this:
> >
> > +//IDN w3.org::www//DTD
> > XHTML 1.0//EN//http:/TR/xhtml-basic/xhtml-basic10.dtd
>
> Wow, I think I may have heard of this in passing, but forgotten. This
> seems really good, maybe we should go with it?

I think we should, but if it doesn't, uh, "explain the thinking behind
the specifications", there will be resistance from the W3C.

> > No. They're exactly the same. The real problem is that, under
> > the current rules, a URI can't be the minimum data following the
> > PUBLIC keyword. Of course, at root, this is just legalistic
> > mumbo-jumbo, and the SYSTEM keyword is the official *kludge* to
> > get around this "problem".
> Why is it a problem? Why should PUBLIC identifiers be used over
> SYSTEM identifiers. ... Um, can you define the semantics of PUBLIC
> and SYSTEM identifiers? Don't bother if it is too much trouble.

In terms of the great Undead Debate - Names vs Addresses - PUBLIC is
basically a "name" and SYSTEM is basically an "address". There's also
an understanding that PUBLIC is portable across environments whereas
SYSTEM is always necessarily "local", that PUBLIC ids are effectively
permanent while SYSTEM ids normally *do* vary, and so on.

The common thread is that a PUBLIC id, as a system-independent name,
always has to be translated to a locally effective address. From an
SGML pov, that's fine, and in fact all that one needs - the emphasis
is on the maintainability and permanence of one's *document data*.
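To make the name-to-address translation concrete, here is a minimal sketch in Python, assuming a catalog made of one `PUBLIC "name" "address"` entry per line (the style of entry used in SGML Open catalogs). The function names `load_catalog` and `resolve` and the local path are invented for illustration; this is not any particular parser's implementation.

```python
import re

# One entry per line: PUBLIC "public identifier" "system identifier"
ENTRY = re.compile(r'^\s*PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*$')


def normalize(public_id):
    """Collapse runs of whitespace so lookups tolerate spacing differences."""
    return " ".join(public_id.split())


def load_catalog(text):
    """Return a dict mapping PUBLIC ids (names) to local addresses."""
    catalog = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            catalog[normalize(match.group(1))] = match.group(2)
    return catalog


def resolve(catalog, public_id):
    """Translate a name to a locally effective address, or None if unknown."""
    return catalog.get(normalize(public_id))


# Hypothetical catalog: the local path below is an assumption, not a real
# installation layout.
CATALOG_TEXT = '''
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "/usr/local/lib/sgml/xhtml1-strict.dtd"
'''

catalog = load_catalog(CATALOG_TEXT)
print(resolve(catalog, "-//W3C//DTD XHTML 1.0 Strict//EN"))
```

The document carries only the name. Moving the DTD means editing the catalog entry, not the document.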
(Think of read-only media like CD-ROMs - once you've "frozen" the
data, like it or not, those are just names in there, because addresses
can and do change.) In a sense, SGML is *all* about names, and *only*
about names:)

The problem was that ISO 8879 forgot to include a standard mechanism
for name resolution. Had there been such a catalog system from the
start, PUBLIC ids (i.e. names) in documents would have sufficed.
Instead, the whole issue of catalogs was punted in favor of a quick
and dirty kludge, SYSTEM identifiers.

Of course, there were benefits to this: the more "local" or insulated
your system was, the more convenient it was to use SYSTEM ids
directly, saving the "hassle" of inventing names for non-existent or
uselessly indirect catalogs.

Well, so the theory and practice went, except that this tactical
shortcut of using SYSTEM ids as "names" did *not* obviate inevitable
changes in effective addresses. So by the time the SGMLopen Catalog
format came about, guess what? SYSTEM ids needed translation too, to
"up-to-date" SYSTEM ids! If that isn't a kludge coming home to roost,
I don't know what is:)

[Actually, the usages that survived this stupidity were the extreme
minimizations - in SGML you can omit the actual literal after the
SYSTEM keyword, leaving it to the app to "imply" the effective value.
The basic reason why this didn't survive into XML is that XML took a
hard line on optional features, so if the SYSTEM keyword had to be
there at all - which it did, as catalogs still weren't standardized -
then the literal had to tag along. Bleagh.]

So, it's not PUBLIC ids but SYSTEM ids that are the real "baggage".
Unfortunately, we're also stuck with the formal variant of PUBLIC ids
(FPIs), which has been a big problem until the WebSGML TC offered an
extension for networks (the +//IDN registered owner class), which now
gives us a way to stick the informational content of a URL into an
FPI. A win/win:)

> I don't see why you say they are the same in the regard.

In that what we *need* is to put the URL into a PUBLIC id, i.e. that
the literal that follows the PUBLIC keyword should be a URL, or its
informational equivalent. There is absolutely no reason why we should
not be able to do this. We lose nothing and get exactly what we want -
so in the sense of performing the same essential function, except for
a bunch of outmoded rules (now thankfully modified) they *are* the
same. Your argument has been that this should be possible, and I
agree:)

The illusion - and fallacy - is to believe that we need the *SYSTEM*
keyword for URLs. No, SYSTEM is kludgery and baggage. With a suitable
definition, PUBLIC works just fine and is all we need. We also put the
name resolution problem - when and where we face it - in the proper
perspective - translate PUBLIC via a catalog (with no-ops as
convenient) and dispense with SYSTEM ids and SYSTEM->SYSTEM
translations altogether. I must be dreaming:)

> I can't say that catalogs are never necessary, because I don't believe
> it. But in this case the URL as as good as (and probably better) than
> the FPI I gave. ... Although your identifier seems better than the URL,
> and I think you may have a point about PUBLIC vs SYSTEM, I need to look
> into that.
>
> > > So surprisingly, the URL is actually independent of machine name
> > > (because of virtual machine names) and independent of protocol
> > > (because of uniformity).
> >
> > Please explain this "uniformity" bit. What happens with ftp?
> I admit this is the most confusing part of my argument. Consider
> a DOS like system. [...] Consider drives A: and C:. [...] But
> this difference in protocol is transparent to the user, because
> weather a drive is a floppy, or disk, or mapped network drive,
> doesn't change how the table is access. This is because each
> media is access (for the users point of view) in exactly the same
> way. This OS has made the access uniform across media.
> In the same way access via http: and ftp: are done in exactly the
> same uniform ways in a URL. (a mapping from names to data).

OK, I see now. The problem, then, is that the path component isn't
invariant across this uniformity (i.e. you can't just substitute 'ftp'
for 'http' in a URL and expect it to be operational).

So do we need to insist on this 'uniformity' concept? (Especially
given my straw proposal for what a +//IDN FPI might look like?)

Arjun
Received on Saturday, 19 February 2000 00:21:48 UTC