- From: Russell Steven Shawn O'Connor <roconnor@uwaterloo.ca>
- Date: Fri, 18 Feb 2000 11:51:10 -0500 (EST)
- To: W3C HTML <www-html@w3.org>
I'm a little surprised to find myself saying this, but after thinking about it for a while, I've come to the conclusion that a URL identifier for XHTML is as good as, and probably slightly better than, an FPI identifier.

I was pretty convinced that a URL was just a location: such-and-such a file on a particular machine, retrieved by a particular protocol. What we want instead is an identifier that says "this is the XHTML DTD", independent of protocol and machine.

But let's look at the URL more carefully:

http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd

It has (more or less) three parts: ``http'', ``www.w3.org'', and ``TR/xhtml-basic/xhtml-basic10.dtd'', plus some separators for parsing. A more logical order is

(1) www.w3.org
(2) http:
(3) TR/xhtml-basic/xhtml-basic10.dtd

because first you connect to a machine, then use the protocol with the file name.

Let's look at each part, starting with (3). TR/xhtml-basic/xhtml-basic10.dtd is just a name. It is a key that is used in a table to retrieve a file. The separators are almost meaningless [1]; if the server wanted, the separators could be : instead of /, or the name could have no structure whatsoever.

(2) is the protocol. But because URLs are uniform, the meaning of (2) is irrelevant to us. It might as well be quix: or I:. The protocol is only important to programmers. As a programmer, I found it hard to ignore the fact that this is a protocol, but once you do, you realize it is like part of the name. Consider a DOS-like file system. If you have a CD-ROM on, say, I:, then you have to use a different protocol to access the CD-ROM than you would use to access a floppy drive on A:. But as a user, you don't care; it is really transparent to you. The drive letter is just a key that returns a table, and then you send another key (the file name) to that table to get the document. But you can just pretend that the protocol plus the file name is one big key for one big table on the machine (1), www.w3.org.

This brings us to (1), the most important part.
It is the part that seems to really make URLs a location and not a document name. But the reality is that there is no machine named www.w3.org. There are machines named slow1.w3.org and slow2.w3.org. These are the machines that actually serve the table that maps (2) + (3) to documents. So www.w3.org isn't a machine; it is a virtual machine name. It is a name that always maps to a machine that is guaranteed to use the table keyed by (2) + (3) to map keys to documents. An important point is that the W3C owns all names in (1) of the form ...w3.org.

Now let's look at the FPI:

-//W3C//DTD XHTML Basic 1.0//EN

It has three parts too:

(1) -//W3C
(2) DTD XHTML Basic 1.0
(3) EN

plus separators for parsing. (Actually, (1) and (2) each have two parts.)

(3) just means the document is in English. We can consider it part of the name (2). (2) is just a key in a table. DTD just means that it is a DTD, XHTML Basic 1.0 is the name of the DTD, and EN means the comments are in English. But (2) + (3) is the same kind of thing as TR/xhtml-basic/xhtml-basic10.dtd. The .dtd says it's a DTD, and xhtml-basic appears repeatedly as the name. There is no mark saying it is in English. One could use .en.dtd as an extension, but since the document will only be in English, omitting it from the URL doesn't matter.

www.w3.org corresponds to -//W3C. Both indicate that the following name (TR/xhtml-basic/xhtml-basic10.dtd or DTD XHTML Basic 1.0) is to be interpreted as a key for a table controlled by the W3C. But the W3C doesn't actually own -//W3C the way it owns www.w3.org, and anyone can make a document with the FPI -//W3C. So URLs really are better in this respect.

So, surprisingly, the URL is actually independent of machine name (because of virtual machine names) and independent of protocol (because of uniformity).

Note that this argument applies specifically to the W3C, because they own www.w3.org. I personally don't own a domain name, so I'd be better off using FPIs (-//Russell O'Connor//DTD ...) for my DTDs.
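
To make the correspondence between the two identifiers concrete, here is a minimal Python sketch of the two three-part decompositions. The function names url_key and fpi_key are hypothetical, chosen only for illustration, and posixpath.normpath is used as an approximation of the equivalence described in footnote [1]; it is not a full URL normalizer.

```python
from posixpath import normpath
from urllib.parse import urlsplit


def url_key(url):
    """Split a URL into (1) owner name, (2) protocol, (3) path key."""
    parts = urlsplit(url)
    # normpath collapses "." and ".." segments, which approximates the
    # equivalence relation on keys mentioned in footnote [1].
    key = normpath(parts.path).lstrip("/")
    return parts.netloc, parts.scheme, key


def fpi_key(fpi):
    """Split an FPI into (1) owner, (2) class + description, (3) language."""
    # An FPI of this shape has four "//"-separated fields; the first two
    # together play the role of the owner, part (1) in the text.
    registration, owner, description, language = fpi.split("//")
    return registration + "//" + owner, description, language


if __name__ == "__main__":
    print(url_key("http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd"))
    print(fpi_key("-//W3C//DTD XHTML Basic 1.0//EN"))
```

In both cases the first element names who controls the table and the remaining elements together form the key looked up in it, which is the parallel the argument relies on.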

[1] The /'s are important because /foo/./../bar is equivalent to /bar, so the keys are really strings modulo this equivalence relation.

--
Russell O'Connor roconnor@uwaterloo.ca
<http://www.undergrad.math.uwaterloo.ca/~roconnor/>
``Paradoxically, a refusal to `put a monetary value on life' means that life is often undervalued.''
-- Artificial Intelligence: A Modern Approach
Received on Friday, 18 February 2000 11:51:13 UTC